How to integrate stereo and voice in one Bluetooth device
Via Eetasia.com
By Yogesh Kamat Mhamai and Bikash Chowdhury, Impulsesoft.
Bluetooth stereo is forecast to become the fastest-growing Bluetooth application, and the second largest. As a result, designers are bringing Bluetooth and music-on-the-go together with Bluetooth accessories, primarily stereo headphones for streaming music.
This is easier said than done. Multiple challenges must be addressed to deliver Bluetooth devices that combine stereo music and voice with a compelling user experience. This article discusses these challenges in detail and suggests ways of overcoming them.
Bluetooth music: things are bright, but…
In the first quarter of 2005, virtually all of the top mobile phone vendors had either announced or launched mobile phones that support music: Nokia announced the N91 model in April, Sony Ericsson launched the Walkman handset at CeBIT in March, Samsung launched the SGHi300 at the same event, and Motorola has been making headlines with the iTunes version of its mobile phone.
The good news is tempered by the fact that adoption of Bluetooth stereo headphones has not kept pace with the growth of mobile music.
The reasons are not hard to grasp. With Bluetooth established as the de facto short-range wireless standard for mobile phones, and with Bluetooth mono headsets increasingly popular, users do not want to carry two Bluetooth audio devices: one for voice (mobile phone) and another for stereo (music player).
Some definitions, to avoid confusion
Bluetooth mono (voice) headsets have been shipping for a while now and have attained a high level of maturity. A Bluetooth mono headset is the small Bluetooth device that people use for handsfree calls from mobile phones. A Bluetooth stereo headphone allows users to listen ONLY to stereo music over Bluetooth.
A Bluetooth stereo headset allows users to take handsfree calls as well as listen to stereo music. Switchover refers to the seamless transition in a stereo headset from a music streaming state to a voice-call state and the subsequent return to music streaming once the call is over.
Bluetooth stereo headset: know the product
Multiple user scenarios relate to Bluetooth stereo and Bluetooth voice. Most of them can be reduced to the following:
* Music player-Mobile phone connectivity
* Multimedia phone connectivity
Let us look at these two user scenarios in greater detail.
Music player-Mobile phone connectivity
A user is listening to streaming music using her Bluetooth stereo headset. She gets a call on her mobile phone. Music pauses automatically and she hears ring tones on the headset; she decides to take the call. As soon as she hangs up, music resumes from where it had paused.
The above user scenario would most likely require the Bluetooth stereo headset to be the common device in two piconets. The mobile phone will be the master in one of the piconets, and the music player will be the slave in the other piconet.
The Bluetooth stereo headset itself would be the master in one piconet and the slave in the other, resulting in a scatternet. In some cases a scatternet may not be created, in which case the Bluetooth stereo headset will be the master.
Fig. 1. Example of a scatternet.
Fig. 2. Example of a piconet
Multimedia phone connectivity
A user is enjoying wireless music using his latest Bluetooth mobile phone that supports music. His phone shows an incoming call; music pauses on the phone, and consequently, on the Bluetooth stereo headset. He takes the call using the Talk button on the Bluetooth stereo headset. As soon as he hangs up, music resumes automatically from where it paused. This creates a simple piconet with two devices.
Both applications look deceptively simple: pause the music, take the call and resume the music once the call terminates. In reality, it is not as simple as it sounds. Integrating voice and stereo capability in a Bluetooth headset while delivering a simple and intuitive user experience is fraught with challenges. Such challenges can be categorized as:
1. Technical: These relate to areas where the Bluetooth specification either does not adequately address an issue or is ambiguous. For example, it is impossible to establish an HV1 SCO link for taking a call when an ACL link for streaming music already exists, because the HV1 packet type needs the entire bandwidth.
2. Implementation: These relate to problems arising out of interpretation of the specification and subsequent implementation of the mandatory or optional features in a profile. For example, many mobile phones establish a SCO connection every time a key is pressed on the mobile phone. This creates a frustrating user experience when streaming music needs to be paused every few milliseconds so that the key press on the mobile phone can be rendered on the Bluetooth stereo headset.
3. Absence of guidelines: These relate to [1] and [2] above. Since the specification may be inadequate or ambiguous at times, and a large number of companies are rushing to implement it in their products, a set of guidelines for co-existence of Bluetooth voice and music is urgently needed. Only recently have the Bluetooth SIG and companies realized the tremendous opportunity that Bluetooth stereo headset applications present. Multiple initiatives are underway to address the issue of co-existence and interoperability for Bluetooth stereo and voice features.
Let’s look at each of the above challenges in detail.
It’s not in the technology
The Bluetooth stereo headset is the center of the integrated universe. In the case of music player – mobile phone connectivity, the Bluetooth stereo headset is the connecting link between the mobile phone and the music player. In the case of multimedia phone connectivity, even though the mobile phone may be aware of both links (SCO and ACL), the Bluetooth stereo headset has a critical role to play in managing the switchover.
As discussed earlier, the user experience involves pausing the music, playing the ring tones on the headset, dropping or taking the call depending on the user's preference, and finally resuming music once the call is complete. Bluetooth stereo headphones have handled the pause and resume sequence in isolation. Similarly, Bluetooth mono headsets have handled the voice sequence well in isolation. The challenge comes in putting these features together.
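The switchover sequence described above can be pictured as a small state machine. The following is only a minimal sketch in Python: the state names and event names (`incoming_call`, `accept`, `reject`, `hang_up`) are hypothetical labels chosen for illustration, not identifiers from any Bluetooth profile, and a real headset would also have to deal with link setup and the error cases discussed later in this article.

```python
from enum import Enum, auto


class State(Enum):
    STREAMING = auto()  # music playing over the A2DP (ACL) link
    RINGING = auto()    # music paused, ring tone playing on the headset
    IN_CALL = auto()    # voice carried over the SCO link


# (current state, event) -> next state; event names are hypothetical.
TRANSITIONS = {
    (State.STREAMING, "incoming_call"): State.RINGING,
    (State.RINGING, "accept"): State.IN_CALL,
    (State.RINGING, "reject"): State.STREAMING,
    (State.IN_CALL, "hang_up"): State.STREAMING,
}


def step(state, event):
    """Return the next state, or stay put if the event does not apply."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=State.STREAMING):
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = step(state, event)
    return state
```

For example, `run(["incoming_call", "accept", "hang_up"])` ends back in `State.STREAMING`, which is the "music resumes once the call is over" behavior the user expects.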
To begin with, the semantics of the Audio/Video Remote Control Profile (AVRCP) commands Pause and Play are not tightly linked with the Advanced Audio Distribution Profile (A2DP) connection. Currently, music players that incorporate Bluetooth stereo adopt one of the following three options when they receive a Pause command:
1. Suspend / Start
2. Disconnect / Connect
3. Streaming silence
Suspend / Start is the ideal implementation for the AVRCP Pause and Play commands. This is shown in Figure 3.
Fig 3. AVRCP Pause / Play implemented using A2DP Suspend / Start.
Unfortunately, Suspend / Start is an optional command in A2DP and not implemented in a large number of solutions. This leads to a situation where one of the following is used as a workaround.
Disconnect / Connect is a mandatory command and does not suffer from the issues faced by the optional Suspend / Start command. This option is shown in Figure 4.
Fig 4. AVRCP Pause / Play implemented using A2DP Disconnect / Connect.
This approach suffers from two major drawbacks. The semantics of AVRCP Pause / Play do not exactly correspond to Disconnect / Connect. It also results in higher latency due to protocol negotiation for reconnection – all codec parameters are negotiated afresh, as though it is a new connection.
Streaming silence is another approach that can be adopted to implement the semantics of AVRCP Pause / Play. When the Bluetooth stereo headset invokes the AVRCP Pause command, the Bluetooth music player can start streaming silence so that, to the end user, music appears to have been paused. This option is shown in Figure 5.
Figure 5. AVRCP Pause / Play implemented using Streaming Silence.
This, in reality, is a simulation of AVRCP Pause / Play. This might be a workable alternative, when Suspend / Start is not implemented, and the latency associated with Disconnect / Connect may be too high for an acceptable user experience.
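The order of preference among the three options can be summarized in a few lines of Python. This is a sketch under stated assumptions, not a description of any shipping music player: the function name and both boolean parameters are hypothetical, and how a product decides that reconnection latency is "acceptable" is left open.

```python
def choose_pause_strategy(supports_suspend, reconnect_latency_ok):
    """Pick how a music player could honor an AVRCP Pause command.

    Both arguments are hypothetical capability flags:
    supports_suspend     -- the optional A2DP Suspend / Start is implemented
    reconnect_latency_ok -- the delay of renegotiating codec parameters
                            on reconnection is acceptable to the user
    """
    if supports_suspend:
        return "suspend_start"       # ideal mapping of Pause / Play
    if reconnect_latency_ok:
        return "disconnect_connect"  # mandatory, but slower to resume
    return "stream_silence"          # simulated pause; the link stays busy
```

A player without Suspend / Start and with slow reconnection ends up with `choose_pause_strategy(False, False)`, which returns `"stream_silence"`.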
It is important to note that, independent of the method adopted to implement the semantics of AVRCP Pause / Play, the end user may not experience true Pause / Play behavior (i.e. music resuming from exactly where it had stopped) unless the Bluetooth AV subsystem has:
1. A digital interface to the music player, and
2. A programmatic interface for controlling the state of the music player
All this leads to the following conclusion. There is no consistent approach adopted by Bluetooth music players for addressing the AVRCP Pause / Play semantics. The lack of agreed-upon guidelines increases the design and implementation complexity for the Bluetooth stereo headset.
It’s in the implementation
Let us now consider the other piece of the puzzle: the mobile phones. In the mono world, mobile phone vendors adopted a simple approach of maximizing voice quality, and justifiably so. Today there are more than 100 million phones with Bluetooth support.
Unfortunately, they vary widely in their Bluetooth voice implementation. For example, some mobile phones require ACL + SCO connection to be established upon call arrival, while some require the ACL link to be always on, and the SCO link to be established only on call arrival; others choose to have the SCO link always on.
In addition, the SCO packet type (HV1, HV2, HV3) supported by the mobile phone can vary between vendors as well as between models from the same vendor. This is shown in Figure 6.
Figure 6. Scenarios to be addressed for Bluetooth voice depending on mobile phone.
Figures 7 and 8 show the sequence chart for ACL link on call arrival and ACL link always on.
Figure 7. ACL link on call arrival.
Figure 8. ACL link always on.
Real life configurations for the music player and mobile phone that the Bluetooth stereo headset designer needs to address stem largely from implementation decisions taken by either or both the music player and mobile phone vendor.
Imagine a case where AVRCP Pause / Play is implemented using A2DP Suspend / Start, and the mobile phone has the ACL Link Always On (SCO Link on Call Arrival) and supports only HV1 packet.
If the Bluetooth stereo headset is streaming music at, say, a 350 kbps bit rate, any request for a SCO connection with HV1 packet type will be rejected by the Bluetooth baseband in the Bluetooth stereo headset.
This is because the HV1 packet type requires the entire bandwidth. An HV1 packet is a single-slot packet carrying 10 bytes of information and needs to be transmitted once in every two time slots. The piconet master needs one slot to send the HV1 packet and the next slot to receive one, which effectively takes up the entire bandwidth.
In the presence of an existing ACL link streaming at 350 kbps (which in an ideal world will eventually go to Suspend mode upon Call Arrival), the Bluetooth baseband in the Bluetooth stereo headset knows it cannot meet the bandwidth requirement of a simultaneous SCO link with HV1 packet type and rejects the request for SCO connection. This is obviously a big problem for mobile phones that support only HV1 packet type.
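The slot arithmetic behind this rejection can be worked through in a short Python sketch. The HV1 figures (10 bytes, once every two slots) come from the text above; the 625-microsecond slot length and the HV2 and HV3 rows (20 bytes every four slots, 30 bytes every six slots) are taken from the Bluetooth baseband definition of these packet types. The function and variable names are illustrative only.

```python
SLOT_US = 625  # length of one Bluetooth time slot, in microseconds

# packet type -> (payload bytes, interval in slots between transmissions)
SCO_PACKETS = {"HV1": (10, 2), "HV2": (20, 4), "HV3": (30, 6)}


def sco_budget(packet_type):
    """Return (fraction of slots used, free slots per interval, bit rate)."""
    payload_bytes, interval_slots = SCO_PACKETS[packet_type]
    used_slots = 2  # one slot to send a packet, one slot to receive one
    free_slots = interval_slots - used_slots
    # Voice bit rate in bits per second in each direction.
    bit_rate = payload_bytes * 8 * 1_000_000 // (interval_slots * SLOT_US)
    return used_slots / interval_slots, free_slots, bit_rate


for name in SCO_PACKETS:
    used, free, rate = sco_budget(name)
    print(f"{name}: {used:.0%} of slots used, {free} free, {rate} bit/s")
```

All three packet types carry the same 64 kbit/s voice stream; what differs is how many slots are left over. HV1 leaves none, which is why an ACL link that is still streaming cannot coexist with it.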
Figure 9 illustrates the sequence chart for HV1 SCO packet type connection + suspend / start.
Figure 9. Sequence chart for HV1 SCO packet type connection + suspend / start.
Imagine another situation where the Bluetooth mobile phone supports the HV3 packet type. The Bluetooth baseband on the Bluetooth stereo headset will accept the incoming connection: the piconet master requires one slot to send an HV3 packet and the next slot to receive one, so the following four slots, and hence their bandwidth, are available before the master needs to send the next HV3 packet.
However, if the Bluetooth music player has implemented AVRCP Pause / Play using the Streaming Silence mechanism, the Bluetooth stereo headset will have a problem managing an ACL link that is streaming data (silence) and a SCO link simultaneously.
Figure 10. Sequence chart for HV3 SCO packet type + streaming silence.
Another problem caused by less-than-ideal implementation is the handling of button presses. There are instances when the mobile phone streams a beep over the SCO link for every button press while music is streaming on the ACL link. The user experience is unpleasant. Whenever a key is pressed, a beep is played out on the Bluetooth stereo headset. This requires pausing the music, switching to the SCO link, playing the beep and resuming the music, only to interrupt it again the next time a key is pressed on the mobile phone. A workaround is to ask the consumer to disable the keypad beeps on the mobile phone. This is both inelegant and expensive. It requires support calls, documentation and user education, and is therefore unacceptable.
Figure 11. Sequence chart for SCO Beeps on mobile phone.
There are multiple configurations that a Bluetooth stereo headphone and a Bluetooth mono headset each need to take care of independently. When the Bluetooth stereo and Bluetooth mono features need to come together, these configurations multiply, leading to increased complexity in the Bluetooth stereo headset design.
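To get a feel for how quickly the combinations grow, the options named in this article can simply be enumerated. The sketch below is illustrative only: the string labels are hypothetical shorthand for the three Pause / Play implementations, the three phone link behaviors and the three SCO packet types mentioned earlier, and `known_issue` flags only the two problems this article describes. It makes no claim about the remaining combinations.

```python
from itertools import product

PAUSE_MODES = ("suspend_start", "disconnect_connect", "stream_silence")
LINK_POLICIES = ("acl_and_sco_on_call", "acl_always_on", "sco_always_on")
SCO_TYPES = ("HV1", "HV2", "HV3")

# Every (pause mode, link policy, SCO packet type) the headset may meet.
configs = list(product(PAUSE_MODES, LINK_POLICIES, SCO_TYPES))


def known_issue(config):
    """Return the problem this article describes for a config, if any."""
    pause_mode, _link_policy, sco_type = config
    if sco_type == "HV1":
        return "SCO request rejected while the ACL link is still streaming"
    if sco_type == "HV3" and pause_mode == "stream_silence":
        return "headset must manage ACL (silence) and SCO simultaneously"
    return None  # not discussed in this article


print(len(configs), "configurations to consider")
```

Even this reduced model yields 27 configurations, before vendor-specific behavior such as key-press beeps is taken into account.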
Resolving basic options
In the past, mobile phone vendors have made product choices that were justified in the isolated context of the mobile phone use case. One example is opting for the HV1 packet type, rather than HV3, for the best audio quality.
Similarly, Bluetooth music player vendors have made choices that work well in a narrow use case, for example using Streaming Silence for AVRCP Pause / Play. The limitations imposed by such choices are proving to be very expensive in the world of stereo-mono co-existence. The recent formation of the AV-HFP working group within the Bluetooth SIG is anticipated to help address this very issue.
Other challenges
There are a host of other systemic issues that need to be addressed in designing a Bluetooth stereo headset. For the sake of completeness, we will briefly touch upon these.
* Managing MIPS requirement in single chip solutions during switchover. The MIPS load may shoot up to 90 to 95 percent of the available horsepower in single chip solutions as most Bluetooth basebands are designed optimally for cost. Care must be taken to handle this peak MIPS load.
* Managing jitter during switchover. While jitter is always a major challenge in wireless streaming, it becomes particularly severe during switchover. Coupled with the peak MIPS load, this becomes a sticky problem to solve.
* Latency. Latency needs to be as small as possible so as to present a near-wired user experience. There are two components to this: call arrival, and resumption of music after the call is over. In the first case, if latency is high, the call might be dropped, while in the latter, the user might feel that music is being lost during switchover. In either case, the user experience suffers.
Conclusion
The requirement of Bluetooth music and voice to co-exist brings to the fore technical and non-technical challenges. The good news is that these challenges are not insurmountable because they relate largely to implementation choices made by individual companies and absence of a cohesive guideline.
These challenges have been recognized and are being addressed. Through co-ordination at the consortium level (Bluetooth SIG) as well as by individual companies, it is possible to convert the potential into a technical and commercial success story.
About the authors
Yogesh Kamat Mhamai is a Senior Software Engineer at Impulsesoft. Yogesh has been involved in developing one of the world’s first Bluetooth stereo headset solutions. He can be reached at yogesh@impulsesoft.com.
Bikash Chowdhury is a Program Manager, Wireless Stereo Program at Impulsesoft. He can be reached at bikash@impulsesoft.com.
