Upcoming AI Neural Net Change

Sorian · October 14, 2014

I heard you guys like developer rants, so I thought I would share a change I worked on over this past weekend.

If you have no idea what a neural network is, or how they are being used in PA, then I suggest you watch my GDC 2012 presentation: http://www.gdcvault.com/play/1015667/Off-the-Beaten-Path-Non.

(Scroll on the left hand side until you see my neural network talk. If my talk didn't bore you I also get a lot of questions at the end during the Q&A session.)

My presentation talks about my work on SupCom 2, but I use many of the same techniques in PA, in addition to some new ones.

The work I checked in this morning is an attempt to make the AI fully commit its units to a fight in situations where running away would leave the AI worse off. For example, if the AI has some infernos in your base and you bring in units to defend, odds are the AI will back off. This is because, up until now, there has been a static threshold separating a good decision from a bad one.

All neural network outputs are in a range between 0.0 and 1.0. Anything below 0.5 means taking that action will result in a less than favorable outcome. Currently, if all possible actions are below the 0.5 threshold, the platoon (AI grouping of units) runs away. However, this does not take into account the fact that the platoon may be dead already and is better off doing what damage it can before it dies. This is where the new stuff comes into play.

Instead of treating running away as the absence of any valid action it is now an output of the neural network. It is trained just like all of the other possible actions so that the AI can learn what happens when a platoon runs away in various circumstances. In the actual game (not training) the run away output is ignored as an action. This is because, in many cases, running away will be seen as the best action due to the fact that neither the AI nor the enemy would take any damage.

Instead of using the run away output directly we use the output to set a new run away threshold. This threshold will be the lower of 0.5 or the run away output. This allows the AI to make the judgment call of staying and fighting because running away would result in more losses for less gain.

Running away is still the absence of any valid action, but the AI now has a way to dynamically lower the bar for what constitutes a valid action.
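To make the threshold change concrete, here is a minimal sketch of the selection logic described above. It is not the actual PA code: the function name, the action names, and the choice to treat a score exactly at the threshold as valid are all assumptions made for illustration.

```python
STATIC_THRESHOLD = 0.5  # from the post: below 0.5 means a less-than-favorable outcome
RUN_AWAY = "run_away"   # hypothetical name for the new run-away output


def pick_action(outputs, dynamic=True):
    """Return the best valid action name, or None when the platoon should run away.

    outputs maps action names to network outputs in [0.0, 1.0] and includes RUN_AWAY.
    """
    # Old behavior: a fixed bar. New behavior: the lower of 0.5 and the run-away output.
    threshold = min(STATIC_THRESHOLD, outputs[RUN_AWAY]) if dynamic else STATIC_THRESHOLD

    # The run-away output is never chosen as an action itself.
    candidates = {name: score for name, score in outputs.items() if name != RUN_AWAY}
    if not candidates:
        return None

    best = max(candidates, key=candidates.get)
    # Assumption: a score exactly at the threshold counts as valid.
    return best if candidates[best] >= threshold else None


# Hypothetical outputs for a raiding platoon that would die even if it retreated.
doomed_raid = {"attack_structures": 0.35, "attack_units": 0.20, RUN_AWAY: 0.10}
print(pick_action(doomed_raid, dynamic=False))  # None (old behavior: run away)
print(pick_action(doomed_raid))                 # attack_structures (new behavior: stay and fight)
```

When the run-away output is 0.5 or higher, the threshold stays at 0.5 and the result matches the old behavior.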

Geers · October 14, 2014

Welp. Back to getting my arse kicked by a calculator.

icycalm · October 14, 2014

At the risk of coming off as unduly negative, which is by no means my intention, I would just like to point out that PA is quite a lopsided game. On the one hand you have the AI, which by all accounts appears to be of unprecedented quality (haven't used it myself so far, but I am quite confident that the praise it's receiving is well-deserved), and on the other hand you have e.g. the biomes, which, though they do have their strengths (merely the fact that they exist on spherical maps being in itself a major accomplishment), are on the whole quite primitive for the genre. So while it's great to hear that the AI is still being significantly improved, there's a small part of me that would rather you were focusing your energies on the biomes instead, or on any number of other aspects of the game that are underdeveloped by comparison. Of course I realize that people have specialties and can't be reassigned to other aspects of the game at will, but that doesn't change how the irrational part of my brain works when I see update posts like the above.

tl;dr pls ignore the above which I just had to point out and get out of my system, and keep up the good work

Zainny · October 14, 2014

icycalm said: ↑

Running the danger of coming off as unduly negative, which is by no means my intention, I would just like to point out that PA is quite a lopsided game. On the one hand you have the AI, which by all accounts appears to be of unprecedented quality (haven't used it myself so far, but I am quite confident that the praise it's receiving is well-deserved), and on the other hand you have e.g. the biomes, which, though they do have their strengths (merely the fact that they exist on spherical maps being in itself a major accomplishment), are on the whole quite primitive for the genre. So while it's great to hear that the AI is still being significantly improved, there's a small part of me that would rather you were focusing your energies on the biomes instead, or on any number of other aspects of the game that are underdeveloped by comparison. Of course I realize that people have specialties and can't be reassigned to other aspects of the game at will, but that doesn't change how the irrational part of my brain works when I see update posts like the above.

tl;dr pls ignore the above which I just had to point out and get out of my system, and keep up the good work

Probably the best way to help that would be for other team members of Uber to post on the forums as well about what they're working on so we can all see the diversity of work that is going on (And god help us if biomes aren't getting some massive love).

One thing also worth mentioning is that Sorian says above he worked on this on his weekend, presumably outside of standard working hours (I hope so, for your sake, guys). Given his background in AI, it may even be a little passion project or feature he's been wanting to do. And when it comes to personal time, anything is fair game, even going outside and enjoying the sun.

Sorian · October 14, 2014

Yes, this was on my personal time. I was trying to go to sleep Wednesday night (I think) last week and this idea popped into my head.

It kept me up for hours.

This happens more often than I would like.

I don't mind the ideas popping up, I would just like it to be during the day, or really anytime other than when I am trying to go to sleep.

ef32 · October 14, 2014

Does this mean that chances of AI comm running straight into my base/open field alone because I destroyed 3 factories in their base are lowered?

Zainny · October 14, 2014

Hey sorian, sorry if this is a dumb question, but do you have a set of data, scenarios, etc. that you use to train the network ahead of time? What (and I suppose when) are the different inputs going into the network for training?

I delved into neural networks for some work I did many years ago, building a Kohonen neural network for the classification of seismic waves around mine sites. For that I was feeding in static waveforms, picking out key features (dominant frequency, spectral envelope, etc.) and evaluating the resulting map clusters. I've since forgotten all that stuff, though, and am now a lowly iOS/Android dev.

Sorian · October 14, 2014

ef32 said: ↑

Does this mean that chances of AI comm running straight into my base/open field alone because I destroyed 3 factories in their base are lowered?

Nope, that will be fixed in the future by other means.

mabdeno · October 14, 2014

Sounds like you're trying to teach the AI how to think a few moves ahead. This alone would make the AI a lot less predictable in situations where you would normally expect only one kind of behaviour.

doud · October 14, 2014

sorian said: ↑

I heard you guys like developer rants, so I thought I would share a change I worked on over this past weekend.

If you have no idea what a neural network is, or how they are being used in PA, then I suggest you watch my GDC 2012 presentation: http://www.gdcvault.com/play/1015667/Off-the-Beaten-Path-Non.

(Scroll on the left hand side until you see my neural network talk. If my talk didn't bore you I also get a lot of questions at the end during the Q&A session.)

My presentation talks about my work on SupCom 2, but I use many of the same techniques in PA, in addition to some new ones.

The work I checked in this morning is an attempt to make the AI fully commit its units to a fight in situations where running away would leave the AI worse off. For example, if the AI has some infernos in your base and you bring in units to defend, odds are the AI will back off. This is because, up until now, there has been a static threshold separating a good decision from a bad one.

All neural network outputs are in a range between 0.0 and 1.0. Anything below 0.5 means taking that action will result in a less than favorable outcome. Currently, if all possible actions are below the 0.5 threshold, the platoon (AI grouping of units) runs away. However, this does not take into account the fact that the platoon may be dead already and is better off doing what damage it can before it dies. This is where the new stuff comes into play.

Instead of treating running away as the absence of any valid action it is now an output of the neural network. It is trained just like all of the other possible actions so that the AI can learn what happens when a platoon runs away in various circumstances. In the actual game (not training) the run away output is ignored as an action. This is because, in many cases, running away will be seen as the best action due to the fact that neither the AI nor the enemy would take any damage.

Instead of using the run away output directly we use the output to set a new run away threshold. This threshold will be the lower of 0.5 or the run away output. This allows the AI to make the judgment call of staying and fighting because running away would result in more losses for less gain.

Running away is still the absence of any valid action, but the AI now has a way to dynamically lower the bar for what constitutes a valid action.

Sorian,
Could you please clarify a few things regarding your neural network implementation?
You say the outputs are different types of actions and that, for training, you randomly select an action.
Then, in order to tell the NN whether it was a good or a bad decision, you compute the fitness function value. And if the fitness function value shows that it was not a good decision, you say the output nodes are adjusted and then the adjustment is backpropagated. I understand the backpropagation part. What I do not get is what you mean by "outputs were adjusted". How do you adjust an output which is an action? Am I to understand that a single output node is a combination of action and fitness delta?

Another question: is deep learning (applied today to computer vision/recognition) something that could be used to build a better AI? It looks to me like complex strategic decisions could take advantage of deep learning.

Thanks.

doud · October 14, 2014

sorian said: ↑

Yes, this was on my personal time. I was trying to go to sleep Wednesday night (I think) last week and this idea popped into my head.

It kept me up for hours.

This happens more often than I would like.

I don't mind the ideas popping up, I would just like it to be during the day, or really anytime other than when I am trying to go to sleep.

Well, it might just be that during the day you're fully concentrated on direct actions. And when you're back at home, your own neural network is producing outputs because it now has time to work for itself. I usually experience the same thing in the evening or early in the morning, drinking my coffee and smoking a cigarette. When I was younger, a coding issue was usually fixed in the middle of the night, because it used to wake me up and I could not stop myself from switching on the computer and fixing the issue ASAP.

superouman · October 14, 2014

sorian said: ↑

Yes, this was on my personal time. I was trying to go to sleep Wednesday night (I think) last week and this idea popped into my head.

It kept me up for hours.

This happens more often than I would like.

I don't mind the ideas popping up, I would just like it to be during the day, or really anytime other than when I am trying to go to sleep.

I know how you feel.

perfectdark · October 14, 2014

LOL at first post ripping an AI dev for not improving terrain.

brianpurkiss · October 14, 2014

WOOHOO!!! This is gonna be a HUGE improvement to the AI.

Up until now it's been easy to defeat the AI by just sending in a large force and the AI runs away. Glad that it will no longer just run away.

draiwn · October 14, 2014

sorian said: ↑

ef32 said: ↑

Does this mean that chances of AI comm running straight into my base/open field alone because I destroyed 3 factories in their base are lowered?

Nope, that will be fixed in the future by other means.

I assume it will also fix the AI sending its commander alone in an Astraeus without scouting the planet first?

exterminans · October 14, 2014

draiwn said: ↑

I assume it will also fix the AI sending its commander alone in an Astraeus without scouting the planet first?

I don't think interplanetary moves are controlled by the neural net yet either.

I still see a problem, though: running away or not is a very situational decision. When deploying mixed platoons, they tend to spread quite far while executing an attack order. Some units, like infernos, tend to be best left alone to die, while combat fabbers and other second-row units actually have a good chance of escaping, so the situation differs quite a lot. Not to mention dox, which are inherently designed around hit-and-run maneuvers.

Static, mixed platoons make it nearly impossible to find a universal solution for the whole platoon at once.

Tell me, would it be possible to implement a split/merge action for platoons? Try to run the NN not only for whole platoons, but for arbitrary classifications / subsets instead. If the outcomes vary too much, split. Or merge with other subplatoons if outcomes match (= same action). That could actually solve quite a lot of problems, such as:

Platoons with similar / identical goals colliding

One platoon retreating while another one stays, even though this now leaves the remaining platoon doomed

Mixed platoons inevitably failing in some situations

Lack of "steamroll" tactics (the AI somehow doesn't seem to understand synchronized attacks yet)
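A rough sketch of what such a split/merge pass might look like, purely as an illustration of the idea above and not anything taken from the PA codebase. The names regroup, classify and best_action are hypothetical, and best_action is a stand-in for running the neural net on a subset of units.

```python
from collections import defaultdict


def regroup(units, classify, best_action):
    """Split a platoon by unit class, then merge classes that pick the same action."""
    # Split: bucket the units by a classification (here, unit type).
    by_class = defaultdict(list)
    for unit in units:
        by_class[classify(unit)].append(unit)

    # Merge: subsets whose best action matches end up in the same sub-platoon.
    by_action = defaultdict(list)
    for members in by_class.values():
        by_action[best_action(members)].extend(members)
    return dict(by_action)


def unit_class(unit):
    # Hypothetical unit naming scheme: "type#id".
    return unit.split("#")[0]


# Stub standing in for the neural net's per-subset decision.
stub_decisions = {"inferno": "attack", "fabber": "retreat", "dox": "retreat"}

platoon = ["inferno#1", "inferno#2", "fabber#1", "dox#1"]
subplatoons = regroup(platoon, unit_class, lambda members: stub_decisions[unit_class(members[0])])
print(subplatoons)
# {'attack': ['inferno#1', 'inferno#2'], 'retreat': ['fabber#1', 'dox#1']}
```

Here the infernos stay and fight while the fabber and dox are merged into one retreating group.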

maxpowerz · October 14, 2014

@sorian
Will you be programming some old-school war moves into the neural network?

It would be amazing to see the AI amass a large force and then attempt to use moves like a pincer movement and attack: http://en.wikipedia.org/wiki/Pincer_movement

maxpowerz · October 14, 2014

exterminans said: ↑

Lack of "steamroll" tactics (the AI somehow doesn't seem to understand synchronized attacks yet)


Also, maybe the AI neural network needs an "Are these units considered disposable?" routine.
If steamrolling or rushing is wanted, then units need to be deemed disposable and just thrown at the enemy regardless of the odds of winning.

Edit..
Obviously the AI would need to rate the effect of steamrolling with a batch/group of disposable units and only use it if it was beneficial to the outcome of the battle.

nawrot · October 14, 2014

icycalm said: ↑

At the risk of coming off as unduly negative, which is by no means my intention, I would just like to point out that PA is quite a lopsided game. On the one hand you have the AI, which by all accounts appears to be of unprecedented quality (haven't used it myself so far, but I am quite confident that the praise it's receiving is well-deserved), and on the other hand you have e.g. the biomes, which, though they do have their strengths (merely the fact that they exist on spherical maps being in itself a major accomplishment), are on the whole quite primitive for the genre. So while it's great to hear that the AI is still being significantly improved, there's a small part of me that would rather you were focusing your energies on the biomes instead, or on any number of other aspects of the game that are underdeveloped by comparison. Of course I realize that people have specialties and can't be reassigned to other aspects of the game at will, but that doesn't change how the irrational part of my brain works when I see update posts like the above.

tl;dr pls ignore the above which I just had to point out and get out of my system, and keep up the good work

Sorian is the right man for the right task; his talents would be wasted on biomes and art. However, I agree: planet biomes look ugly and unpolished. If I remember correctly, they have had only one update since alpha, and only some of them at that. Hopefully it all gets fixed someday, before the ancients wake up.

silenceoftheclams · October 14, 2014

Sorian, greetings from the mountainhomes. Your efforts are legend there.
