Jon Michael Galindo

~ writing, programming, art ~

17 March 2017

Training AI

I have had a few thoughts on artificial intelligence while reading Michael Nielsen's free online book about neural networks and deep learning. I thought I would compose a short summary post, and then append some of my own thoughts on how game developers might use some of the concepts undergirding neural nets.

Neural nets in a blurb: A neural network is a collection of nodes connected by weighted links; internally, it is just a list of numbers, the weights. Running it requires another list of numbers, the inputs. The net multiplies the inputs by the weights, sums them at each node, and passes each sum through an activation function, ultimately producing a list of numbers as output. It is general-purpose because certain versions of this system can be shown to reproduce the building blocks of any deterministic information process.
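The blurb above can be sketched in a few lines. Everything here is an arbitrary toy example: the layer sizes, the particular weights, and the choice of a sigmoid activation are mine, not anything specific from Nielsen's book.

```python
import math

def forward(inputs, layers):
    """Run a list of input numbers through a toy fully-connected net.

    `layers` is a list of (weights, biases) pairs, where weights is a
    matrix (one row of weights per node in the layer). Each node sums
    its weighted inputs plus a bias, then squashes the sum with a
    sigmoid activation.
    """
    activations = inputs
    for weights, biases in layers:
        activations = [
            1.0 / (1.0 + math.exp(-(sum(w * a for w, a in zip(row, activations)) + b)))
            for row, b in zip(weights, biases)
        ]
    return activations

# Two inputs -> two hidden nodes -> one output, with made-up weights.
layers = [
    ([[0.5, -0.5], [0.3, 0.8]], [0.0, 0.1]),   # hidden layer
    ([[1.0, -1.0]], [0.0]),                    # output layer
]
out = forward([1.0, 0.0], layers)
```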

Data training in a blurb: The real bread-and-butter here is gradient descent. Subtract the output from an ideal "training" output to create an error value. Calculate the partial derivative of the error with respect to each weight in the net. Adjust each weight a tiny distance along its gradient, away from the error. Select a new training input/output pair and repeat. Over time, the net begins to produce output similar to the training output, provided that output can be produced by a deterministic information process requiring less computation than the net itself performs.
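The loop above, reduced to its smallest possible case: a "net" that is a single weight, with the gradient computed analytically. The training data and learning rate are invented for illustration.

```python
# The "net" is just output = w * x. For squared error (w*x - y)**2,
# the derivative with respect to w is 2 * x * (w*x - y).
def train(samples, w=0.0, lr=0.1, epochs=100):
    for _ in range(epochs):
        for x, y in samples:
            error = w * x - y          # output minus training output
            w -= lr * 2 * x * error    # step a tiny distance down the gradient
    return w

# This training data encodes y = 3x, so w should converge toward 3.
w = train([(1.0, 3.0), (2.0, 6.0)])
```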

The concept is antithetical to programming. It is more like writing a multiple-choice test, and then having a student (who cannot read the test's language) learn the correct responses by repeatedly completing and being graded on that test. It comes with some caveats, but in practice it enables the deployment of information processes no programmer has ever managed to design. The structure of the software must still be programmed intuitively, but because it can be represented discretely and is highly modular, genetic algorithms can mix-and-match variants to produce results similar in quality to a programmer's efforts.
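The mix-and-match idea can be sketched with a toy genetic algorithm over flat weight vectors. The fitness function, population size, and mutation scale below are all invented for illustration; real systems evolve net structure, not just a short vector.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def evolve(fitness, length=4, pop_size=20, generations=50):
    """Toy genetic algorithm: evolve a weight vector that maximizes `fitness`."""
    pop = [[random.uniform(-1, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]             # keep the fitter half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = random.sample(parents, 2)
            cut = random.randrange(1, length)      # single-point crossover
            child = a[:cut] + b[cut:]
            i = random.randrange(length)           # point mutation
            child[i] += random.gauss(0, 0.1)
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Hypothetical fitness: vectors closest to a target score best.
target = [0.5, -0.2, 0.9, 0.0]
best = evolve(lambda w: -sum((a - b) ** 2 for a, b in zip(w, target)))
```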

That is all the important, real stuff; now for some of my more trivial, game-related thoughts. I would like to focus on the training aspect, because I believe it gives game developers an opportunity for emergent behavior.

Any in-game AI designed with modular weights in some kind of decision tree can then be trained against a reward function using some mathematical equivalent of gradient descent. In the process, very curious behaviors tend to arise. These lend games a quirkiness rarely seen in simple paths or decision trees, yet the quirks are also highly structured, completely unlike PRNG-driven randomness.
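A minimal sketch of that kind of reward-driven tuning. Here random hill climbing stands in for a true gradient method, and the reward function for a hypothetical guard AI (trading off chasing the player against holding a post) is entirely invented for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical decision weights: how strongly the guard chases vs. holds its post.
def reward(weights):
    chase, hold = weights
    # Imaginary reward: both behaviors help, but extremes are penalized.
    return chase * 1.0 + hold * 0.5 - (chase ** 2 + hold ** 2) * 0.4

def hill_climb(weights, steps=200, sigma=0.05):
    """Tune decision weights by keeping random perturbations that raise the reward."""
    for _ in range(steps):
        candidate = [w + random.gauss(0, sigma) for w in weights]
        if reward(candidate) > reward(weights):
            weights = candidate
    return weights

tuned = hill_climb([0.0, 0.0])
```

In a game, `reward` would be replaced by actually simulating the AI in its world and scoring the outcome, which is where the curious behaviors come from.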

Of course, quirkiness itself is not necessarily good: It might mean standing by a wall indefinitely instead of walking around it. Nevertheless, I treasure the possibility. In-game characters need quirkiness amidst effective structure to evoke a sense of person. It might afford games something they do not currently contain: very personable behavior, without the need for true intelligence. The workflow would be to repeatedly adjust the AI's priorities, train it against its world to see whether any interesting behavior emerges, and then hard-code that version of the AI into the final game. Of course, the ideal would be a system that can continue to develop new behaviors without losing its best attributes, but that is probably impossible without human review. Regardless, I will continue to experiment with the idea to see how it can be used in various game styles.

© Jon Michael Galindo 2015-2018