The TORCS Racing Board
Username: Password: Remember Me?
Lost Password Register
Author: timfoden | Created: 2019-11-26 14:57:21
Subject: Found a good derivative-free optimiser -- thought I'd share
For a while now I've been using a python script to optimise the parameters in my robot's configuration file. While I was developing this script I tried a number of the documented "global derivative free" optimisation algorithms like:

- Simulated Annealing (too slow, not directed enough, too difficult to configure)
- Nelder-Mead (n-simplex) search (doesn't seem to cope well with the "noisy" nature of the search landscape)
- Gradient-descent (only works well in the early stages of the search where the step size is large.)
- also (grid search, simplex subdivision, HOO)
- others I can't remember.
- Tabu search combined with gradient descent (this is what I ended up with, but it gets really slow.)

Well recently I was reading some Google machine learning paper (sorry right now can't find a reference for this -- edit: found it: https://worldmodels.github.io/) about getting the machine to play some simple games given the same view of the screen a human would have, and one of the things they used was an optimisation algorithm in it called CMA-ES (or Covariance Matrix Adaptation Evolution Strategy) -- see: https://en.wikipedia.org/wiki/CMA-ES

This is a black-box real-valued search algorithm that can cope with a noisy objective function and a high number of dimensions.

Once I fit it into my search framework and normalised the input state given to CMA (the docs say this should be done, but initially I didn't) it seems to work really well. Only been playing with it for a few days, but thought I'd share.

Last thing I set off (last night) had 128 parameters to optimise, and it looked like it had done really well by the time I left the house this morning (using 4 cores on my 4970K). The report from the google paper implied that it could handle around 3000 parameters if needed.

Source code in a variety of languages (including python which is what I'm using) is available from the author's web-site http://cma.gforge.inria.fr/
Last Edited: 2019-11-26 22:32:25 by timfoden
    Author: dummy | Created: 2019-11-26 18:26:45
    Subject: Re: Found a good derivative-free optimiser -- thought I'd share
    This all sounds very interesting, thanks for sharing. I have to admit, that I don't know anything about optimisation algorithms or machine learning. Mouse seems to be much more advanced than anything I could come up with my pay grade.

    I have a feeling that I couldn't compete against you even when using Mouse myself. This has more of a science project than I could have imagined. It's time for me to start learning new things or get left behind ;)
    Last Edited: 2019-11-26 18:26:45 by dummy
      Author: timfoden | Created: 2019-11-26 21:21:05
      Subject: Re: Found a good derivative-free optimiser -- thought I'd share
      Well I certainly don't mean to disenhearten anyone. I actually thought (admittedly from one of Andrew's old comments) that we all use some form of machine based parameter optisation (in the .xml files or other robot config files), so that's why I posted this.

      For myself I've always felt that the racing lines generated by the plain clothoid path algorithm are somewhat sub-optimal and am always looking out for ways to try to improve it. Nothing really new on that from this year in mouse though, except for the .spr files, which to be honest were slightly machine optimised (via the curve factor values in the .xml files -- same as used by mouse in the 2017 season) cloithoid paths with some manual tweaks on top only.

      It's always time to learn new things though isn't it? :)
      Last Edited: 2019-11-26 21:22:38 by timfoden
        Author: firechief | Created: 2019-11-27 01:57:04
        Subject: Re: Found a good derivative-free optimiser -- thought I'd share
        I agree with Danny, this stuff is over my head too :)

        I did implement a rough form of simulated annealing into a version of USR in order to try and auto-configure it for tracks. As you said Tim it's too slow.

        I'd run it with a version of Torcs that had no tire wear or fuel consumption, and that auto-repaired damage after a few seconds, and was simply looking to tune the speed at which corners could be taken & how late to brake. Even so it took many tens of thousands of laps, often to come up with values that were within a 10th of a second (plus or minus) of what I could come up with manually in way less time. Eventually I gave up on the idea.

        Yep it's always good to learn something new - I hope one day to have the time and energy to get into this :)
        Last Edited: 2019-11-27 01:57:46 by firechief
          Author: timfoden | Created: 2019-11-27 10:36:32
          Subject: Re: Found a good derivative-free optimiser -- thought I'd share
          Well, if you do manage to set up your parameter optimisation to use CMA-ES instead of simulated annealing I'd pretty much guarantee that it'd be a *lot* faster to get good values.

          For me it was fairly easy to set up as I already had a python script that iteratively writes a new robot config file, runs up torcs in practice mode (headless) and then extracts the average lap time from the results file. Thus I can easily write a python function which given a set of parameter values can calculate the average lap time. This fits well with using the python CMA-ES code, as the procedure is to ask the CMA library to optimise, and you pass it 3 things:

          1. a function that takes a set of parameter values and returns the evaluation (in my case the average lap time.)
          2. a vector of initial parameter values.
          3. a global initial step size (this is a single scalar, kind of used like a radius around the initial parameter values, which is why the docs ask for parameter values that all have the same scale.)

          e.g.:
          x, es = cma.fmin2(None, normalised_state, 0.1, parallel_objective=cma_evaluate_states)

          For you it sounds like you're making changes on a lap by lap basis within your robot code while running torcs, so there may be an inversion of control issue. Perhaps would need to run the optimiser in another thread and feed values to/from it on a queue. Or dig a little into the lower level API and find a way to do it demand driven from the robot code (I think this should be possible, just didn't need it in my case.)
          Last Edited: 2019-11-27 10:47:13 by timfoden
            Author: firechief | Created: 2019-11-27 23:33:02
            Subject: Re: Found a good derivative-free optimiser -- thought I'd share
            Hey, that actually makes sense - I'm definitely going to do this. For next year though, no chance of implementing anything for the last race. I guess you'll be 2 seconds a lap faster rather than just one :D

            In my in-robot implementation, it'd run for 3 laps to gain an average, then change the parameters & recalculate the cornerspeeds etc before running the next 3 laps, and so on. I actually had the track divided into sectors so it could time how long it took to complete individual portions of the lap, which meant it could be evaluating different parameters in parallel to each other. Even then it was so slow to come up with anything useful.

            Your external python method is a lot simpler and cleaner.

            Edit: Curiosity got the better of me and I installed the python version of cma. Having browsed through the documentation, which as I feared was not layman-friendly at all and had no examples I could understand, I've kinda hit a wall with it. Yes it has a fully documented API, but I have no idea how to "plug it in" or decipher its output.
            Last Edited: 2019-11-28 10:21:55 by firechief
              Author: timfoden | Created: 2019-11-29 11:02:58
              Subject: Re: Found a good derivative-free optimiser -- thought I'd share
              Assuming that you've got it installed (I just did "pip install cma") so it can be used from a python script with "import cma", and that numpy is also installed (I had it already, but I think the pip install will install it for you as a dependency), here's a short (but complete) example (indentation will be lost due to the forum):

              from __future__ import print_function
              import cma
              import numpy as np
              import math

              def my_function( state ):
              global best_value

              # sum of x * (x - 1), minimum should be at x=0.5 for all dimensions
              # replace this line with code that works out the average lap time
              # by running torcs.
              value = sum(state * (state - 1))

              # keep track of the best value found so far.
              if best_value > value:
              best_value = value
              print( "new best:", state, "-->", value )

              return value

              initial_state = np.array([0] * 10) # search in 10 dimensions
              best_value = math.inf

              x, es = cma.fmin2(my_function, initial_state, 0.1)

              print( "optimised state:", x )

              ------
              for the code to find the fastest lap time I don't expect the fmin2 function to ever return... so I rely on the best so far detection in the evaluation function itself.


              Edit: This page https://pypi.org/project/cma/ also has a couple of fairly simple examples on it.


              Edit2: And the example after "And a little more elaborate exposing the ask-and-tell interface:" shows how to use it in a demand driven way as you'd need to run it from within a robot.


              Edit3: The python help for fmin "help(cma.fmin)" also has some examples in it. Note that the initial sigma0 passed into the fmin2 function doesn't really matter very much, as the algorithm adapts it as it goes along. As I normalise all the state values to be in the expected range of 0..1, I just set it to 1/10 of this, which is 0.1, which appears to do a reasonable job.
              Last Edited: 2019-11-29 15:44:38 by timfoden
                Author: firechief | Created: 2019-11-30 14:00:33
                Subject: Re: Found a good derivative-free optimiser -- thought I'd share
                Brilliant explanation, and thanks for the code pointers too! I really appreciate your sharing this with everyone Tim - personally I expect I'll be able to make use of this for more than just robot optimisation too.
                Last Edited: 2019-11-30 14:00:33 by firechief
                Author: dummy | Created: 2019-11-30 16:07:57
                Subject: Re: Found a good derivative-free optimiser -- thought I'd share
                Yes thank you so much. I hope it will help us to close the gap a little and have more contested races :)
                Last Edited: 2019-11-30 16:07:57 by dummy
        Author: dummy | Created: 2019-11-27 05:07:11
        Subject: Re: Found a good derivative-free optimiser -- thought I'd share
        >Well I certainly don't mean to disenhearten anyone.
        In contrary it's a honor to have you with us.

        >It's always time to learn new things though isn't it?
        That's the main reason I'm hanging around here :)
        Last Edited: 2019-11-27 05:07:11 by dummy
    Author: wdbee | Created: 2019-11-29 16:35:04
    Subject: Re: Found a good derivative-free optimiser -- thought I'd share
    As you know my version is similar to the genparopt racemanager module I made for Speed Dreams.

    But to be honest, there are two show stoppers build in TORCS:

    First the bug that the first driver in qualifying gets a different start configuration. While the optimization it always runs as first driver, while in race there is only one first driver, and Andrew made one of his drivers be the first using the =Names= just in the season the bug raised ;)
    At E-Track-2 I have about one second difference at the moment starting as first compared to starting later (As first the third lap is valid, starting later it is invalid)!

    Second is the SOLID change bug. Often my bot crashes getting NANs as parameters form TORCS. This is because in SOLID they changed the data type resulting in NANs at TORCS side. So I have to restart the optimization again and again.

    From my point of view a smarter algorithm would be nice but does not help, as long as we have these bugs in TORCS.


    Last Edited: 2019-11-29 16:35:04 by wdbee
      Author: timfoden | Created: 2019-11-29 18:01:51
      Subject: Re: Found a good derivative-free optimiser -- thought I'd share
      I work around the qualifying thing by setting up an endurance race with qualifying only (no actual race) and putting my bot in twice. Thus it runs qualifying for both a first and a non-first bot. I scrape the lap time data from the results file. I return as the actual lap time the worst of the two times. The optimiser thus tends to make both times the same.

      I've never heard of this NaN issue.
      Last Edited: 2019-11-29 18:01:51 by timfoden
        Author: wdbee | Created: 2019-11-29 18:39:46
        Subject: Re: Found a good derivative-free optimiser -- thought I'd share
        Hi Tim,

        your way allows to just select another race type. Here the optimization is build in a variant of TORCS (Means I do not have to load TORCS again and again. For a 3 lap race it takes about a second, without the NAN bug this would allow to race 3600 races per hour). It should be possible to rewrite the code to allow more than one driver while optimization, but this would take some time and this is something I do not have at the moment ;).

        I'm happy to have the optimizer as it is, because the setups made by it starting from the default setup at least resulted in some times getting the first pit.

        With help of Danny (valgrind, uninitialized variables) I was able to fix some bugs in my bot. I added a check for NANs in the code and so I can say, it is the TORCS/SOLID bug that makes my bot crash.
        Last Edited: 2019-11-29 18:41:01 by wdbee
      Author: firechief | Created: 2019-11-30 04:41:54
      Subject: Re: Found a good derivative-free optimiser -- thought I'd share
      > Andrew made one of his drivers be the first using the =Names= just in the season the bug raised ;)

      Well, not just that season - am still using it. It was/is just a means of getting the drivers listed together in order to scratch my OCD itch - I don't think it had any effect on their qualifying times. Certainly it hasn't helped me qualify first this year :)
      Last Edited: 2019-11-30 04:41:54 by firechief