I tried Unity ML-Agents




  • Unity 2019.4.9f1 (LTS)
  • ML Agents 1.3.0 (Preview)
  • ML Agents Extension 0.0.1 (Preview)
  • Anaconda Python 3.8
  • Windows 10

Next article

⇒ “Make a turn-based game with Unity ML-Agents”

Article scope

- This article introduces the development environment and aims to give a basic understanding of it.
- Unity and C# are not explained.
- Sorry, but the concept of reward-based reinforcement learning and machine learning itself are not explained either.
- Anaconda, Python, and TensorFlow are not covered.
- This article contains my own interpretation. I would appreciate it if you could point out anything that differs from the facts.



- Download and install Unity Hub.
- Open Unity Hub and install the latest LTS version from Installs.
- At first I tried 2018.4.x (LTS), but the editor crashed frequently, so I switched to 2019.4.x (LTS).

- Download Anaconda and install it with the default settings.
- You will be asked about the PATH setting, but you can proceed without changing anything.
- Start Anaconda Navigator, then exit it once the automatic setup is complete.
- For development, use `Anaconda Prompt` from the Start menu. (Windows)

- Install ML-Agents by following the official installation guide.
- Clone or download the repository.
  - The repository as a whole serves as the development environment.
- Create a Unity project based on the template in the Project folder inside it.
  - The folder name (Project) can be changed. (If you create several machine learning projects without renaming it, they all show up under the same name in Unity Hub.)
- In the Unity Package Manager, install the required Unity packages.
- With pip3, install the required Python packages.
  - Use `Anaconda Prompt` from the Start menu. (Windows)

Try an introductory guide

- First, try the “Introduction Guide”.
- Let's try retraining the model.
- Use `Anaconda Prompt` from the Start menu. (Windows)
  - First, change the current directory to the parent folder of your Unity project (that is, the root of the ML-Agents repository).
  - For a normal first training run:
    - mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun
  - To resume after an interruption:
    - mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun --resume
  - To overwrite the previous results:
    - mlagents-learn config/ppo/3DBall.yaml --run-id=first3DBallRun --force
  - When the ASCII-art Unity logo is followed by "Start training by pressing the Play button in the Unity Editor.", press the Play button in the editor.
  - Press Ctrl + C in the console (`Anaconda Prompt`) to interrupt training. The editor then stops playing automatically.
- Even if you stop Play mode in the editor abruptly, the results still seem to be saved on the console side, but this does not appear to be recommended.
- Copy the training result (the ~.nn file) from the results folder to the Assets/TFModels folder.
- The first time only, assign the file in the agent's component.
- If you want to see more examples, go to Learning Environment Examples (https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Examples.md).

Create a new learning environment

- Next, try “Create a new learning environment” (https://github.com/Unity-Technologies/ml-agents/blob/master/docs/Learning-Environment-Create-New.md).
- The agent ended up looking like the one below.
- For the training configuration file (~config.yaml), I used the sample.


using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Unity.MLAgents;
using Unity.MLAgents.Sensors;

public class RollerAgent : Agent {

	[SerializeField] private Transform target = default; // Target
	[SerializeField] private float forceMultiplier = 10; // Multiplier applied to the action force
	private Rigidbody rBody; // Rigidbody of the agent
	private const float reachMargin = 1.42f; // Distance at which the target counts as reached

	/// <summary>Object initialization</summary>
	private void Awake () {
		rBody = GetComponent<Rigidbody> ();
	}

	/// <summary>Agent initialization and reset</summary>
	public override void OnEpisodeBegin () {
		if (transform.localPosition.y < 0) {
			// Reset the agent if it has fallen off the platform
			rBody.angularVelocity = Vector3.zero;
			rBody.velocity = Vector3.zero;
			transform.localPosition = new Vector3 (0, 0.5f, 0);
		}
		// Move the target to a new random position
		target.localPosition = new Vector3 (Random.value * 8 - 4, 0.5f, Random.value * 8 - 4);
	}

	/// <summary>Observation of the environment</summary>
	public override void CollectObservations (VectorSensor sensor) {
		// Target and agent positions
		sensor.AddObservation (target.localPosition);
		sensor.AddObservation (transform.localPosition);
		// Agent velocity
		sensor.AddObservation (rBody.velocity.x);
		sensor.AddObservation (rBody.velocity.z);
	}

	/// <summary>Action and reward allocation</summary>
	public override void OnActionReceived (float [] vectorAction) {
		// Action, size = 2
		var controlSignal = Vector3.zero;
		controlSignal.x = vectorAction [0];
		controlSignal.z = vectorAction [1];
		rBody.AddForce (controlSignal * forceMultiplier);
		var distanceToTarget = Vector3.Distance (transform.localPosition, target.localPosition);
		// Reached the target
		if (distanceToTarget < reachMargin) {
			SetReward (1.0f);
			EndEpisode ();
		}
		// Fell off the platform
		if (transform.localPosition.y < 0) {
			EndEpisode ();
		}
	}

	/// <summary>Manual control for testing the environment</summary>
	public override void Heuristic (float [] actionsOut) {
		actionsOut [0] = Input.GetAxis ("Horizontal");
		actionsOut [1] = Input.GetAxis ("Vertical");
	}
}


Challenges and coping

The following issues have arisen and have been addressed.

Training with an executable on Windows fails to launch

Following “Independent Executable Learning Environment”, I passed the **folder** I had built to the mlagents-learn argument --env=<env_name>, but training would not start and failed with “mlagents_envs.exception.UnityEnvironmentException ~ Couldn’t launch”.


Specifying the full path (/-delimited) of the executable **file** for <env_name> fixed it.
It worked with or without the file extension.

Freeze with `EndEpisode()`

The entire editor freezes.


`EndEpisode()` internally calls `OnEpisodeBegin()`.
Therefore, if you call `EndEpisode()` inside `OnEpisodeBegin()`, the call re-enters itself and never returns.
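
One way to stay clear of this reentry is sketched below. It assumes the check that ends the episode can be moved into OnActionReceived(), so that OnEpisodeBegin() only resets state. The class name ReentrySafeAgent and the helper ResetPose() are hypothetical names made up for this illustration, and the fall check is borrowed from the RollerAgent above. Treat it as a rough outline rather than the definitive fix.

```csharp
using UnityEngine;
using Unity.MLAgents;

// Hypothetical agent showing where EndEpisode() is safe to call.
public class ReentrySafeAgent : Agent {

	/// <summary>Reset only; never end the episode from here</summary>
	public override void OnEpisodeBegin () {
		// Calling EndEpisode() here would invoke OnEpisodeBegin() again,
		// which would call EndEpisode() again, and so on without end.
		ResetPose ();
	}

	/// <summary>Check the end conditions here instead</summary>
	public override void OnActionReceived (float [] vectorAction) {
		// Fell off the platform: end the episode from the action step.
		// EndEpisode() then calls OnEpisodeBegin(), which only resets.
		if (transform.localPosition.y < 0) {
			EndEpisode ();
		}
	}

	// Hypothetical helper: put the agent back at its start position.
	private void ResetPose () {
		transform.localPosition = new Vector3 (0, 0.5f, 0);
	}
}
```

With this split, every path into OnEpisodeBegin() ends in a plain reset, so the call chain stops after one level.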

Learn more so that it can be applied

- The examples introduced here all learn behavior in 3D space, but by studying “Agent” and “Designing Learning Environment”, I was able to see a path to a turn-based game.
- By studying “Training Configuration File”, you will be able to create your own ~config.yaml files.
- By studying “Independent Executable Learning Environment”, you will no longer have to tie up the editor during long training runs.
- It seems that you need to learn more about machine learning itself in order to optimize training and get better results.