For localization, see https://developers.google.com/actions/localization/action-packages
That said, even I find that many things are not documented by Google. The SDK is still in beta.

For audio responses, the audio data sometimes arrives too late. Unfortunately, there is no good way to tell whether the audio is just late, broken, or missing. I may add timeout and wait-for-response features in the next version. (I’m building a whole new v3 and can release it in a few weeks.)
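
To give a rough idea of what a timeout on the audio response could look like, here is a minimal sketch. It is not the v3 code, and it does not use the SDK at all: `wait_for_audio` and `fake_audio` are hypothetical names, and Python's asyncio is used only for illustration. It also cannot tell late audio from lost audio; it only stops waiting after a deadline.

```python
import asyncio
from typing import Awaitable, Optional


async def wait_for_audio(audio: Awaitable[bytes], timeout_s: float) -> Optional[bytes]:
    """Return the audio bytes, or None if nothing arrived within timeout_s.

    Hypothetical helper: `audio` stands in for whatever awaitable
    delivers the audio response in a real client.
    """
    try:
        return await asyncio.wait_for(audio, timeout=timeout_s)
    except asyncio.TimeoutError:
        # The deadline passed. We cannot know whether the audio is late,
        # broken, or missing, so the caller has to decide what to do next.
        return None


async def fake_audio(delay_s: float) -> bytes:
    """Hypothetical stand-in for a real audio stream that answers after delay_s."""
    await asyncio.sleep(delay_s)
    return b"audio"
```

With something like this, the caller could show a "waiting for response" state while the call is pending and fall back to a retry or an error message when it gets `None`.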