I did a bit of WebRTC development for video chat, and its state machine is one of the most complicated I've ever dealt with. Even household-name commercial providers don't handle all of the edge cases. It took me about a week to get it right. When session negotiation goes wrong, audio can easily be heard before the call is established (this will certainly happen with a naive implementation; even Google's own reference implementation had issues).
If you're curious, check out this flowchart slide from a Google I/O WebRTC talk:
https://image.slidesharecdn.com/2014q2-geekandkranky-scalabi...
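
To make the "audio before the call is established" problem concrete, here is a rough sketch of the kind of gate I mean. This is not the real WebRTC state machine and not what any provider actually ships; the state names, event names, and the `step` and `audioEnabled` helpers are all made up for illustration, and a real implementation has many more states and edge cases (glare, renegotiation, ICE failures, and so on). The only idea it shows is that audio should be switched on by an explicit state, never as a side effect of negotiation happening to finish.

```typescript
// Hypothetical call-level states and events; none of these names come from the WebRTC API.
type CallState = "idle" | "offering" | "ringing" | "established" | "ended";
type CallEvent = "sendOffer" | "receiveOffer" | "receiveAnswer" | "userAccept" | "hangup";

// Allowed transitions. Anything not listed here is treated as a bug, not silently ignored.
const transitions: Record<CallState, Partial<Record<CallEvent, CallState>>> = {
  idle: { sendOffer: "offering", receiveOffer: "ringing" },
  offering: { receiveAnswer: "established", hangup: "ended" },
  ringing: { userAccept: "established", hangup: "ended" },
  established: { hangup: "ended" },
  ended: {},
};

// Advance the state machine, rejecting events that are invalid in the current state.
function step(state: CallState, event: CallEvent): CallState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`invalid event "${event}" in state "${state}"`);
  }
  return next;
}

// Audio is allowed only once the call is explicitly established.
function audioEnabled(state: CallState): boolean {
  return state === "established";
}
```

In a browser you would presumably feed the result of `audioEnabled` into something like the `enabled` flag on the audio `MediaStreamTrack`, so the callee's microphone stays off while the call is still ringing, even if the media connection itself is already up.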