[Unity (C#) x PUN2] How to implement voice chat on Oculus Quest

3 minute read

Voice chat

There are various libraries for implementing voice chat, but I decided to use Photon, which seems to have the most information available in this area.

However, I couldn't find an **article on how to implement voice chat with PUN2**.

I figured I could get by with just the official documentation, but testing whether the connection actually worked was tedious and ate up a lot of time, so I'm writing it up as a note.


First, import the asset **Photon Voice 2** from the Asset Store.

**Also import PUN2.**

Next, you need to create an application ID.
After creating an account, create an application of the type **Photon Voice**, as shown below.


Create Photon application


Set the application ID in **PhotonServerSettings**, found at the following path.

Enter the application ID of the **Photon Voice** type you created earlier.
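You can also verify from a script that the Voice App ID made it into the settings asset. A minimal sketch, assuming PUN2's `PhotonNetwork.PhotonServerSettings` and the `AppSettings.AppIdVoice` field:

```csharp
using Photon.Pun;
using UnityEngine;

public class VoiceAppIdCheck : MonoBehaviour
{
    void Start()
    {
        // PhotonServerSettings is the same asset you edit in the inspector
        var settings = PhotonNetwork.PhotonServerSettings.AppSettings;

        if (string.IsNullOrEmpty(settings.AppIdVoice))
        {
            Debug.LogError("Voice App ID is not set in PhotonServerSettings.");
        }
    }
}
```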

Next, prepare the necessary components in the Hierarchy.
**PhotonVoiceNetwork** and **Recorder** are required.

Attach them to any suitable object; as long as the parts inside the red frame match, you should have no problems.
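The same setup can also be done from a script instead of the inspector. A sketch, assuming Photon Voice 2's `PhotonVoiceNetwork` singleton and its `PrimaryRecorder` property:

```csharp
using Photon.Voice.PUN;
using Photon.Voice.Unity;
using UnityEngine;

public class VoiceSetup : MonoBehaviour
{
    void Awake()
    {
        // The Recorder captures the local microphone input
        var recorder = gameObject.AddComponent<Recorder>();
        recorder.TransmitEnabled = true;

        // Register it as the primary recorder so PhotonVoiceViews can use it
        PhotonVoiceNetwork.Instance.PrimaryRecorder = recorder;
    }
}
```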

In the slot where the prefab called Avatar is attached, set the avatar (the synced object) that is instantiated over the Photon network.
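The avatar itself is spawned with `PhotonNetwork.Instantiate`, which creates the prefab on every client once you have joined a room. A minimal sketch (the `"Avatar"` prefab name is an assumption; the prefab must live in a `Resources` folder):

```csharp
using Photon.Pun;
using UnityEngine;

public class AvatarSpawner : MonoBehaviourPunCallbacks
{
    public override void OnJoinedRoom()
    {
        // Spawn the networked avatar prefab on all clients
        PhotonNetwork.Instantiate("Avatar", Vector3.zero, Quaternion.identity);
    }
}
```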

The avatar itself also needs some setup, so let's take a look.

**Speaker** and **PhotonVoiceView** are required.
An AudioSource is added automatically.


Set the parts inside the red frame.
For **SpeakerInUse**, attach the Speaker from the avatar's own object.
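If you prefer wiring this up from code rather than the inspector, the Speaker on the same object can be handed to the view. A sketch, assuming `PhotonVoiceView.SpeakerInUse` is assignable in your Photon Voice 2 version:

```csharp
using Photon.Voice.PUN;
using Photon.Voice.Unity;
using UnityEngine;

public class AvatarVoiceSetup : MonoBehaviour
{
    void Awake()
    {
        var view = GetComponent<PhotonVoiceView>();
        // Use the Speaker attached to this same avatar object
        view.SpeakerInUse = GetComponent<Speaker>();
    }
}
```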

This completes the basic setup.

Reflecting voice in mouth movement and syncing it over the network

The avatar used this time is the one in the image below.

The sphere is the head and the mouth is an independent object placed in the child hierarchy of the head.

I borrowed it from Normcore, a library for implementing VR/AR communication synchronization.

This time, I want to make this **mouth object flap open and closed in sync with the voice**.

The code that animates the mouth is published in Normcore's documentation, so I adapted it for PUN2.

The code from the Normcore documentation:

```csharp
using System.Collections;
using System.Collections.Generic;
using UnityEngine;
using Normal.Realtime;

public class MouthMove : MonoBehaviour {
    public Transform mouth;

    private RealtimeAvatarVoice _voice;
    private float _mouthSize;

    void Awake() {
        // Get a reference to the RealtimeAvatarVoice component
        _voice = GetComponent<RealtimeAvatarVoice>();
    }

    void Update() {
        // Use the current voice volume (a value between 0 - 1) to calculate the target mouth size (between 0.1 and 1.0)
        float targetMouthSize = Mathf.Lerp(0.1f, 1.0f, _voice.voiceVolume);

        // Animate the mouth size towards the target mouth size to keep the open / close animation smooth
        _mouthSize = Mathf.Lerp(_mouthSize, targetMouthSize, 30.0f * Time.deltaTime);

        // Apply the mouth size to the scale of the mouth geometry
        Vector3 localScale = mouth.localScale;
        localScale.y = _mouthSize;
        mouth.localScale = localScale;
    }
}
```


The full code is below.

```csharp
using Photon.Pun;
using Photon.Voice.PUN;
using UniRx;
using UniRx.Triggers;
using UnityEngine;

/// <summary>
/// Moves the mouth while the player is talking
/// </summary>
public class MouthSyncVoice : MonoBehaviourPun
{
    [SerializeField] private Transform _mouth;

    private PhotonVoiceView _voice;
    private float _mouthSize;

    void Start()
    {
        if (photonView.IsMine)
        {
            _voice = GetComponent<PhotonVoiceView>();
            _voice.RecorderInUse.TransmitEnabled = true;

            // Run every frame via UniRx's UpdateAsObservable
            this.UpdateAsObservable()
                .Subscribe(_ =>
                {
                    // Smoothly move the Y-axis scale of the mouth object with Lerp
                    float targetMouthSize = Mathf.Lerp(0.1f, 1.0f, 100 * _voice.RecorderInUse.LevelMeter.CurrentAvgAmp);
                    _mouthSize = Mathf.Lerp(_mouthSize, targetMouthSize, 30.0f * Time.deltaTime);

                    // Sync the mouth movement to all clients
                    photonView.RPC(nameof(SyncMouth), RpcTarget.All, _mouthSize);
                });
        }
    }

    /// <summary>
    /// Applies a change in mouth size (runs on every client)
    /// </summary>
    /// <param name="mouthSize">Mouth size</param>
    [PunRPC]
    private void SyncMouth(float mouthSize)
    {
        Vector3 localScale = _mouth.localScale;
        localScale.y = mouthSize;
        _mouth.localScale = localScale;
    }
}
```

Voice transmission starts at the point where `_voice.RecorderInUse.TransmitEnabled = true;` is set.

The average amplitude of the input over the last 0.5 seconds is obtained with `_voice.RecorderInUse.LevelMeter.CurrentAvgAmp`.

This value is used to change the scale of the mouth object in the Y-axis direction.

In the final SyncMouth method, changes in the size of the mouth object are synchronized on both clients.


Since it's a GIF there is no sound, but I was able to confirm that the other party's voice came through the HMD and that the mouth flapped along with it.



Sometimes it connected, sometimes it didn't, and sometimes the audio was extremely quiet; it was riddled with issues, so I'll add more details once I find the causes.