Sunday livestream

Discussion in 'Planetary Annihilation General Discussion' started by Sorian, February 14, 2015.

  1. Sorian

    Sorian Official PA

    Messages:
    998
    Likes Received:
    3,844
    Ok, there are two reasons why I think that is an incorrect assumption.

    1) During training the platoons only care about the units in their own platoon. While this can cause some fuzziness when there are allied units around, this should be an edge case given the sheer amount of training.
    2) Attacking anti-surface defenses and attacking other (non anti-surface) defenses are two separate outputs, which means they should learn differently.

    If after more intense training I still see weird behaviors like you mention I will re-evaluate the inputs and adjust them.
  2. exterminans

    exterminans Post Master General

    Messages:
    1,881
    Likes Received:
    986
    That's IMHO the biggest weakness of the AI. We both know that it does a great job at micro-managing platoons and taking the action which makes the best use of an individual platoon during encounters in the open field, when each side occupies only a single movement layer. But it completely fails in all these "edge cases" during an attack on a base, where a simple joint attack could have caused far more damage for less metal investment.
    Ok, that even means it didn't just happen by chance. During the last match, the ground force in the final assault even prioritized the AA tower over the laser tower (and all factories) with no allied air around.

    And I don't think that the net will learn that relation. It has probably seen the tower as a target without risk and a cheap kill, and maybe as a generic threat. But the way the training is set up, it can't possibly have grasped the relation between not killing AA and a FUTURE loss of an air platoon.

    EDIT:
    Unless...
    What would happen if the training also applied retroactively to platoons which have previously encountered the enemy?
    E.g. if platoon A retreats against platoon C, and platoon B later on (within a range of 10-30 seconds) gets killed by platoon C, then A gets rewarded for not dying, but also punished for the death of B.

    That way, the net could be trained to respect the consequences of disregarding allies.
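
    Roughly, in Python and with every name, hook and constant made up purely to illustrate the idea:
    Code:
        # Sketch: when platoon B dies to platoon C, also punish platoon A
        # if A encountered C shortly before and walked away.
        import time

        RETRO_WINDOW = 30.0  # seconds; the "10-30 seconds" range from above

        recent_encounters = []  # (timestamp, our_platoon, enemy_platoon, training_sample)

        def record_encounter(our_platoon, enemy_platoon, sample):
            recent_encounters.append((time.time(), our_platoon, enemy_platoon, sample))

        def on_platoon_killed(victim_platoon, killer_platoon, penalty):
            """Platoon B was just killed by platoon C: propagate the penalty
            to earlier platoons that met C within the window."""
            now = time.time()
            for t, our_platoon, enemy, sample in recent_encounters:
                if enemy is killer_platoon and our_platoon is not victim_platoon \
                        and now - t <= RETRO_WINDOW:
                    apply_training(sample, penalty)  # hypothetical trainer hook

        def apply_training(sample, reward):
            # Placeholder: feed the stored inputs plus the adjusted reward
            # back into whatever owns the neural net.
            pass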
    Last edited: February 19, 2015
  3. Sorian

    Sorian Official PA

    Messages:
    998
    Likes Received:
    3,844
    That is true, it won't learn about future losses. That isn't the intended use of the neural net. The intended use is the tactical decision making of a single platoon, not as an overall strategic decision making tool. Making a neural network that controlled the strategic decision making processes of the AI would be a huge undertaking.
  4. exterminans

    exterminans Post Master General

    Messages:
    1,881
    Likes Received:
    986
    Long term strategy: Yes, that's out of scope.

    Short term however should be possible with surprisingly little effort.

    During training, just remember for 5-20 seconds where a platoon was, and who it encountered (not necessarily who it fought, but who was in proximity). With a lower weight, scaled by the time passed, also apply training to it when a different platoon has an encounter at either the same position or with the same opponents.

    That way, a platoon is actually rewarded for saving another platoon, or punished for causing its loss, which should cause the network to make assertions about upcoming encounters.
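
    As a tiny sketch of the weighting I have in mind (window length, linear decay and the matching rule are all just placeholders):
    Code:
        ENCOUNTER_WINDOW = 20.0  # seconds; somewhere in the 5-20 second range above

        def retro_weight(seconds_since_encounter):
            """Scale the retroactive training down the longer ago the encounter was."""
            if seconds_since_encounter >= ENCOUNTER_WINDOW:
                return 0.0
            return 1.0 - seconds_since_encounter / ENCOUNTER_WINDOW

        def encounters_match(old, new, radius=100.0):
            """Credit/blame an old encounter if it happened at roughly the same
            position or involved the same enemy platoon (attributes made up)."""
            dx = old.position[0] - new.position[0]
            dy = old.position[1] - new.position[1]
            same_place = (dx * dx + dy * dy) ** 0.5 <= radius
            return same_place or old.enemy_id == new.enemy_id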
  5. Sorian

    Sorian Official PA

    Messages:
    998
    Likes Received:
    3,844
    That would take a surprising amount of work, actually.

    [Edit] You did give me an idea on how I can condense the inputs, however. Well, maybe.[/Edit]

    [Edit 2] By the way, in case it isn't clear from my replies, I am enjoying this conversation @exterminans. I don't get to talk neural networks often because I am the only person I know really using them in games. [/Edit 2]
    Last edited: February 19, 2015
  6. exterminans

    exterminans Post Master General

    Messages:
    1,881
    Likes Received:
    986
    Yes, it wouldn't be free and I didn't mean to say it was easy, but it appears cheaper than all the alternatives. Well, that is assuming there are any other options for making platoons aware of short-term strategic considerations, short of writing an entirely new AI for that task.

    Well, one more point on the ever growing "things to try when I have nothing better to do" list.

    EDIT: The Chronocam does offer that log if I'm not mistaken. It's basically a look backwards whenever an event occurs which causes training. Emitting additional events during the simulation could aid that, at least for encounters, so no additional parsing of the game state would be required.
    Last edited: February 19, 2015
  7. crizmess

    crizmess Well-Known Member

    Messages:
    434
    Likes Received:
    317
    Sounds a bit like temporal difference learning.

    But keep in mind that, since a platoon doesn't have any information about what other platoons are around, this will converge to a static set of probabilities for what kinds of platoons may encounter an enemy within the next few seconds.
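
    For reference, the plain TD(0) value update looks roughly like this (just the textbook form with made-up state names, nothing PA-specific):
    Code:
        # Textbook TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
        def td0_update(V, state, reward, next_state, alpha=0.1, gamma=0.9):
            td_error = reward + gamma * V.get(next_state, 0.0) - V.get(state, 0.0)
            V[state] = V.get(state, 0.0) + alpha * td_error

        # e.g. a future air loss "leaking" backwards to the decision not to kill the AA:
        V = {"air_platoon_lost": -1.0}
        td0_update(V, state="near_intact_aa_tower", reward=0.0, next_state="air_platoon_lost")
        # V["near_intact_aa_tower"] is now -0.09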
    exterminans likes this.
  8. exterminans

    exterminans Post Master General

    Messages:
    1,881
    Likes Received:
    986
    @crizmess Thank you so much! I had no idea what it is called, as I just made up what seemed logical to me, based on what I saw in PA. So inputs which evaluate other allied platoons are actually necessary; good to have that backed up as a fact as well.

    PS: Whoa, the research on that topic is about 30 years old by now :eek:

    @sorian That 2010 paper about applying that approach to board games looks very promising: http://www.cs.bris.ac.uk/Publications/Papers/2000100.pdf
    stuart98 likes this.
  9. someonewhoisnobody

    someonewhoisnobody Well-Known Member

    Messages:
    657
    Likes Received:
    361
    Just wondering, what is the "zu" namespace you keep using? My best guess is that it is a custom math library that Uber has made.

    Also what build system are you guys using? You said it was some Python system so I assume it isn't CMake. Maybe GYP? Or some in house magic?
  10. Sorian

    Sorian Official PA

    Messages:
    998
    Likes Received:
    3,844
    zu is our low level engine stuff. crom is on top of that, and the engine sits on top of that. They are just namespaces to differentiate the different layers.

    Our python build stuff is some other set of magic.
    Remy561 and someonewhoisnobody like this.
  11. stevenrs11

    stevenrs11 Active Member

    Messages:
    240
    Likes Received:
    218
    I just played a game that has a very good example of this, in combination with another problem I think.

    There was a choke point between a lava lake and a mountain that the AI was constantly sending units through, so I built a single line of walls with two double laser turrets behind them. Intelligently (or by chance, not sure), the AI more or less just avoided the choke.

    Then I built a pelter behind them. Now I'm not sure if this was just chance or the AI said to itself, "Oh crap pelter KILL IT NOW", but the AI started attacking those walls like nuts. This is where the problem started.

    It would send its mixed army of bolos and infernos at the walls, just into range of its tanks. They would fire a shot or two, then retreat, taking losses in the process. Over and over again. If at any point it had actually committed and gotten the infernos into range of the walls, it would have wrecked the defensive line. With the AI's two combat fabbers, it would probably have lost only two of its five infernos, and none of its tanks. If it had kept the combat fabbers in range as well, it might not have lost any units at all.

    It got worse when the AI sent a second platoon into the choke. The two platoons were moving in opposite directions, at one point trapping the majority of the infernos (moving away from the wall) in range of the turrets while preventing the tanks (moving towards the wall) from entering their firing range.

    In the end, these two double-barreled turrets behind walls were firing continuously for 56 seconds against a vastly superior force that should have totally wiped them out.

    Here is a video of the relevant bits- link (when it uploads)

    Also, I wanted to give you the lobby ID/replay ID, but the game didn't show up in my recent games.
    Last edited: February 20, 2015
    thelordofthenoobs likes this.
  12. Sorian

    Sorian Official PA

    Messages:
    998
    Likes Received:
    3,844
    I think I have a fix for the back and forth thing. Should be fixed further once the new neural networks are done.
  13. stevenrs11

    stevenrs11 Active Member

    Messages:
    240
    Likes Received:
    218


    I intended to post the lobby ID as well, so sorry for the uninformative (but cinematic) camera angle.
  14. crizmess

    crizmess Well-Known Member

    Messages:
    434
    Likes Received:
    317
    Yes, TD learning is really old. I never seriously worked on machine learning, so take all my talk for what it is, mostly ramblings. ;)
    TD learning is the dual form of Q learning (meaning both solve the same problem, but via perpendicular approaches) - by the way, Q learning is from 1989. If you look at their update rules it is obvious that both are direct "consequences" of the Markov decision process (plus some magic to actually prove the convergence, which is really the important part).
    That's the reason why they popped up so early. Once most of the math about Markov chains was in place, it was just a matter of time.
    And they are still being used, because they are really fundamental and universal: once you have a (hidden) Markov model - and those are almost everywhere - you can use them to approximate it. The math behind it is really simple, and the update function is really easy to understand.
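
    For comparison, the textbook Q learning update works on state-action values; the max over the next actions is what distinguishes it from the plain TD(0) value update sketched a few posts up (again purely generic, nothing PA-specific):
    Code:
        # Q learning: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        def q_update(Q, state, action, reward, next_state, actions,
                     alpha=0.1, gamma=0.9):
            best_next = max(Q.get((next_state, a), 0.0) for a in actions)
            td_error = reward + gamma * best_next - Q.get((state, action), 0.0)
            Q[(state, action)] = Q.get((state, action), 0.0) + alpha * td_error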

    BTW: @sorian the AI could use a Markov model to approximate the damage graph to learn about unit compositions. It is really easy (more or less ;) ). I'll write something up if you're interested.
