
Article: The Value Proposition In Audio: Voice Control For Audiophiles (You Can't Buy Too Soon - You Can Only Sell Too Late)



1 hour ago, The Computer Audiophile said:

This is fantastic @bluesman

 

I've been following josh.ai for a while now. I think josh is the company to watch in the high end space for sure. 

Josh is cool, for sure - but they seem to be using current methods and tools to achieve something for which current methods and tools are not ideally suited.  Unless they're the ones to come out of the garage or basement with the next big thing in AI coding and output modalities, they'll be looked on as crude when someone else finally succeeds.  I know of no current platform that integrates the various functions necessary to achieve accurate and efficient voice control over devices in disparate systems, including what I know of Josh (which admittedly isn't a lot at the design level).

 

Right now, there are too many data bouncing 'way too far over 'way too many jury-rigged networks to do this smoothly.  And platform integration is not in the cards for an industry that profits largely from differentiation, so there's not likely to be one approach shared by all. This is a lot like the world of electronic medical records.  The most they hope for is "interoperability" - and that has us unable to share data universally across all healthcare institutions plus payers and the scientific community.  All Epic users can share their data if they wish, as can users of several other major EMR platforms.  But these programs aren't written in the same languages and they run on different architectures.  So if your hospital is on Epic and the one where you ended up unconscious in the ER because you fell off the train is on Cerner, you're out of luck unless you carry your medical records around on a USB drive or a CD.

 

It's a lot like needing home hubs for Z-wave, Zigbee, Google Home, Samsung Smart Things, SmartLife, and Apple Home because you have a few devices that work on each platform.  There are several smart speakers that require you to control some audio functions from Alexa and some from Google Assistant - this is no way to run a railroad.  As I was just saying to my watch, "Siri, tell Alexa to open House Band; Siri, tell Alexa to tell House Band to tell JRiver to play music by Wayne Henderson in the master bedroom; Siri, tell Alexa to make the music louder;"

Link to comment

Good article. I use five Harmony Hubs (only $70 each at Amazon) at home, each tied to a different Gmail and Amazon account. This enables me to voice control one of the most annoying functions for my wife and visitors:  how to turn on a system and switch it in a particular room for watching TV, or listening to 2-channel or multichannel music. There is essentially no audiophile product I’ve purchased that this setup can’t simplify because Harmony has such a deep database of electronics products.  The separate accounts allow asking Alexa or Google to turn on stereo, for example, and it will only impact the system in that particular room without turning on stereos throughout the house. 
 

I tried HouseBand with JRiver early on and found it frustratingly difficult to get it to work. Sounds like I should give it another go, perhaps tied to my Apple Watch. JCR 

Link to comment
39 minutes ago, jrobbins50 said:

Good article. I use five Harmony Hubs (only $70 each at Amazon) at home, each tied to a different Gmail and Amazon account.

Thanks!
 

Your willingness to use multiple accounts and devices to “integrate” functions says that you’re flexible and adventurous, like me.  But we’re in the minority by far - most people would think we’re a bit daft to go that far.......and we shouldn’t have to.  I believe it won’t be long before there are more universal platforms and approaches available to us.  But until then, let’s stretch the envelope to see how much it can hold 😁

Link to comment

Very professional article! Piqued my interest so did a bit of Googling:

 

| System | Platform/Hardware | Notes |
|---|---|---|
| aido | Robot with GUI | cameras, multiple CPUs and GPUs, not available yet, price? |
| athena | | Open source software project written in Python |
| bixby | Samsung mobile devices | Samsung's equivalent of Google Assistant |
| hound | Automotive | |
| jibo | Robot for healthcare and education | |
| josh | | interfaces in posh homes to Lutron lighting, Sonos, Crestron thermostat, home security, smart TVs, home theatre |
| mycroft | Runs on Raspberry Pi etc. | Open source software project (Python?), Mk I was $180 now sold out, Mk II coming soon |
| ubi ucic | Runs on Android and Linux | Ubi Kit is free for developers, supports Google Assistant and Alexa |
| viv.ai | | Viv is an artificial intelligence platform - intelligent personal assistant software created by the developers of Siri, bought by Samsung. Now to be integrated into Bixby 2.0 |

🎸🎶🏔️🐺

Link to comment
3 hours ago, blue2 said:

Very professional article! Piqued my interest so did a bit of Googling:

 


Thanks for your time & comments!  There's so much potential here that I'm amazed the audio industry hasn't recognized how important VC & AI are for development and sales of future products.  We all experience mechanical controller failures in everything we use, from audio to cars to coffee machines.  Tiny touch screens, bubble switches, touch sensitive controls etc are only pseudoelectronic - they still have physical parts that fail too often.  We could eliminate most of those last century pieces and concepts by integrating excellent voice recognition and synthesis with AI. Imagine no more noisy pots, no cracked or dented bubble switches, no broken or lost knobs, minimal internal wiring, etc.

 

Then imagine being able to control and monitor every audio parameter of interest to us in real time using voice input and synthesized voice response.  Throw in AI's ability to monitor real time performance and identify impending failures by detecting as yet inaudible changes in everything from voltage & current stability at various points to distortion to early ID of asymmetry in channel outputs.  In addition to telling your system what you want to hear (and how and where and when...), you could ask for a status check and get a verbal response plus a downloadable log report.  You could set up spontaneous verbal warnings when voltage, temperature, and other metrics go out of spec.  You could alter or switch amplifier operating characteristics to A-B changes in SQ.

 

How about a status report when powering up, e.g.  "Good morning, Bob - your system is in perfect operating condition and ready to play"?  Get vocal alerts as needed - "THD in your left channel has increased to 105% of the right channel.  Diagnostics show early failure of V4 with no other abnormality.  Replace tube."
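The alert idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the metric names, the `Limit` ranges, the 3% tolerance, and the wording of the alerts are all hypothetical and are not taken from any real product or API.

```python
from dataclasses import dataclass


@dataclass
class Limit:
    """Hypothetical allowed range for one monitored metric."""
    low: float
    high: float


def check_metrics(readings, limits):
    """Return one spoken-style alert for each reading outside its range."""
    alerts = []
    for name, value in readings.items():
        limit = limits.get(name)
        if limit is None:
            continue  # no range defined for this metric, so nothing to check
        if not (limit.low <= value <= limit.high):
            alerts.append(
                f"{name} is {value}, outside {limit.low} to {limit.high}."
            )
    return alerts


def channel_imbalance(left_thd, right_thd, tolerance=0.03):
    """Return an alert if left THD differs from right by more than tolerance.

    tolerance is a fraction (0.03 means 3%); the default is an assumption.
    """
    if right_thd <= 0:
        return None  # cannot form a ratio without a positive reference
    ratio = left_thd / right_thd
    if abs(ratio - 1.0) > tolerance:
        return f"THD in your left channel is {ratio:.0%} of the right channel."
    return None
```

A text-to-speech layer would then read out whatever these functions return; the sketch covers only the threshold logic, not the measurement or the speech synthesis.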

 

This is cool stuff!  I can't wait to play with it all as it develops.

Link to comment
25 minutes ago, bluesman said:

Thanks for your time & comments!  There's so much potential here that I'm amazed the audio industry hasn't recognized how important VC & AI are for development and sales of future products.  We all experience mechanical controller failures in everything we use, from audio to cars to coffee machines.  Tiny touch screens, bubble switches, touch sensitive controls etc are only pseudoelectronic - they still have physical parts that fail too often.  We could eliminate most of those last century pieces and concepts by integrating excellent voice recognition and synthesis with AI. Imagine no more noisy pots, no cracked or dented bubble switches, no broken or lost knobs, minimal internal wiring, etc.

 


Agree 100%
 

It’s all cool stuff and it’s helpful (system status alerts etc...). Getting it cool and helpful, not gimmicky, is key. 
 

I’d love to change a setting without navigating an endless menu that I haven’t used in 6 months. VC makes it easy. 

Founder of Audiophile Style | My Audio Systems

Link to comment
Quote

Google devices will play 24/96 FLACs, and their high end devices have embedded Chromecast so they can be used as DLNA zones in JRiver.  If you want to have voice control over JRiver playing through a Google device as a zone, you’ll have to add the 

The last sentence needs an ending.

 

Note that you don't need JRiver to use DLNA with Chromecast Audio. I use my CCA devices with two QNAP controllers (Music Station and QMusic), and with BubbleUPnP. I believe there are others too (maybe MConnect and Kazoo?). No voice control though, which I don't care about. 

 

Main System: QNAP TS-451+ NAS > Silent Angel Bonn N8 > Sonore opticalModule Deluxe v2 > Corning SMF with Finisar FTLF1318P3BTL SFPs > Uptone EtherREGEN > exaSound PlayPoint and e32 Mk-II DAC > Meitner MTR-101 Plus monoblocks > Bamberg S5-MTM sealed standmount speakers. 

Crown XLi 1500 powering  AV123 Rocket UFW10 stereo subwoofers

Upgraded power on all switches, renderer and DAC. 

 

Link to comment
1 hour ago, audiobomber said:

The last sentence needs an ending.

 

Note that you don't need JRiver to use DLNA with Chromecast Audio. I use my CCA devices with two QNAP controllers (Music Station and QMusic), and with BubbleUPnP. I believe there are others too (maybe MConnect and Kazoo?). No voice control though, which I don't care about. 

 

Whoops!  Chris was having some problems with the formatting of the original document I sent (a conversion from odt to docx).  When I converted it to a pdf for him, I must have converted the wrong draft.  Here's what it should have said:

 

"If you want to have voice control over JRiver playing through a Google device as a zone, you’ll have to use Alexa. If she’s not sharing a device with the GA, you can link her from an Amazon device or a third party host using an app like Helea Smart."  [Chris, if you can drop this in, it will save others the irritation of the typo.]

 

I'm sorry if I gave the erroneous impression that you had to use JRMC in order to cast to CCAs in general.  My point was that if you use Google smart speakers but want to have voice control over JRMC playing to them, you have to use Alexa either from an Alexa-enabled device or with a 3rd party integration app. Google speakers will show up as DLNA zones in JRMC if you have BubbleUPnP etc running along with JRMC, so there's some functional integration there among JRMC, Alexa and the GA (albeit crude integration).  But it's not ideal, and it's one reason we chose Amazon / Alexa for our primary smart platform.

Link to comment

For an audiophile system with voice control, can't you use a Bluesound Node 2i with digital output to an external DAC which feeds your stereo?  You can then use Google Assistant or Alexa to choose a song to play on the Node 2i.  I would assume that Tidal Connect to the Node 2i could be controlled with Google Assistant and Alexa as well.

Link to comment
1 hour ago, palpatine242 said:

For an audiophile system with voice control, can't you use a Bluesound Node 2i with digital output to an external DAC which feeds your stereo?  You can then use Google Assistant or Alexa to choose a song to play on the Node 2i.  I would assume that Tidal Connect to the Node 2i could be controlled with Google Assistant and Alexa as well.

Yes you can.  There are several streamers like this, but a comparison was far beyond the scope of this article. Yamaha makes 2 models with which I'm familiar, Denon has the Heos system and devices, etc.  Voice control in all of these is limited by what Alexa, GA etc can do.

 

From the Bluesound website, "...you can use voice commands to play saved playlists, select your favorite radio station, adjust volume levels, or even group Players together".  The Node 2i will respond to Alexa using the Bluesound skill and to Google Assistant using middleware called Blue Voice. As long as you've set up everything correctly from the DAC downstream, you can control the stated functions from the Node - but you can't control any other element in your system.  There are enough downsides to this to deter me from using it.

 

For example, your power amp gain control has to be set high enough to encompass the loudest playback you'll ever use, if Alexa's controlling your system volume at the streamer.  This leaves your speakers vulnerable to any transients generated in your front end but not attenuated by the variable gain stage, e.g. at turn-on and turn-off or when switching sources.  If you have an uncontrollable pop with any function, your voice controller can't cut the gain before executing it.
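A rough worked example of the gain-staging point above, as a small Python sketch. The numbers (30 dB of attenuation at the streamer, 26 dB of amplifier gain) and the function name are assumptions chosen only to illustrate the arithmetic; real levels depend on the equipment.

```python
def output_level_db(signal_db, streamer_attenuation_db, amp_gain_db,
                    after_volume_control=False):
    """Level at the power amp output, in dB relative to the signal reference.

    A signal that enters the chain after the streamer's volume control
    (for example a turn-on pop) skips the streamer's attenuation entirely.
    """
    attenuation = 0.0 if after_volume_control else streamer_attenuation_db
    return signal_db - attenuation + amp_gain_db


# Hypothetical setup: the voice assistant has the streamer turned down 30 dB,
# and the power amp is fixed at 26 dB of gain.
music = output_level_db(0.0, 30.0, 26.0)
pop = output_level_db(0.0, 30.0, 26.0, after_volume_control=True)
print(f"music: {music:+.1f} dB, pop: {pop:+.1f} dB, difference: {pop - music:.1f} dB")
```

Under these assumed numbers, the pop reaches the speakers 30 dB above the music, which is exactly the attenuation it bypassed. With the volume control at the amplifier instead, both would be cut by the same amount.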

 

The BS Node is marketed to "...instantly [breathe] new life into your decades-old stereo equipment" by adding network and web streaming sources. The only analog inputs I see are in the combo 3.5mm optical / line level jack, which must be how one connects a turntable or CD player.  I can't tell if you can stream the output over the LAN or WLAN (I assume not) but the 2 way BT should let you drive BT speakers with any source.  The USB input is only for storage devices - you can't connect a USB turntable or other real time USB source.  There are many limitations to this approach to voice control, although it does work within the limits of system and technological constraints.  Stay tuned - it will get better!

Link to comment

Excellent article, bluesman.  Voice control in audio for the general public will probably have a better chance at success than for audiophiles.  We're just too damn picky. Sure, I'd like to say "play music," then have the stereo turn on and start playing music where it left off.  I can currently do that by pressing play from my PC, iPad or iPhone using KEF's LS50 Wireless speakers via Roon.  No power on or off needed for those.  But when it comes to the nitty gritty stuff, I want to search through Roon to find what I want to listen to at that moment.  Sure, I could say "play such and such," but as an audiophile, I'll want a specific version of a song, in a certain format, and then that will dictate what I want to hear next.  So for simple tasks maybe it'll be OK, but for me, I'd rather do things from my iPad or PC.  Status updates or changes to your system would definitely be a cool feature though.

 

Using Comcast's remote (or voice control for any TV/Cable/Video in general) makes more sense, since you're watching something that might take from 30 minutes to 3 hours to watch.  Pick up the remote, say what you want and sit back.  Even so, it's still faster for me to punch in a 3 digit channel and hit enter (the OK button) than it is to bring the remote to my mouth, press the voice control button, say what I want to watch (about 70% correct for me) and wait for the channel to change.  

 

The other thing for me: it would seem weird to say things out loud while listening to music.  If I'm rocking out, I'd have to turn down the volume or yell into the room to change the song being played or add something to the queue.  If I'm with guests, the last thing I want to do is tell them what they're going to hear next by saying it.  Some of the nostalgia of listening to music with my friends is playing something that will surprise them, and seeing their reaction. 

 

Just my 2 cents,

Shawn

Computer setup - Roon/Qobuz - PS Audio P5 Regenerator - HIFI Rose 250A Streamer - Emotiva XPA-2 Harbeth P3ESR XD - Rel  R-528 Sub

Comfy Chair - Schitt Jotunheim - Meze Audio Empyrean w/Mitch Barnett's Accurate Sound FilterSet

Link to comment

A great article.  

 

I have voice control in my car and on my  Xfinity remote control.  Seldom use the features.  I was going to buy a faraday (sp?) envelope for the remote control.  But since the TV knows more about you than the IRS, I decided it was not worth the expense. 

 

I never have understood why people would put active espionage equipment in their home.  Alexa, I am looking at you!

 

I have tape over my computer camera.  I never use location services unless I am lost.  I have every possible option turned off on both phone and camera.  No Facebook (pure evil), no Twitter, no social media at all.  BTW, I do not consider Audiophile Style to be social media.  Audiophile Style is an information source and you are all my friends, right?

 

I know all of the above is worthless, but if I can irritate some random data collection AI somewhere for at least 3 nanoseconds it is worth it.  Oh, and I log into Rolls Royce.com at least 3 times a day just to screw up the ad tracker on my browser......

 

Regards.  George Orwell

 

 

In any dispute the intensity of feeling is inversely proportional to the value of the issues at stake ~ Sayre's Law

Link to comment
5 hours ago, NOMBEDES said:

A great article.  

 


 

Great post. 🙂

 

I'm also not interested in voice control. My Mac Mini doesn't have a microphone and that is the only thing I have connected to the internet.

 

I also don't use any social media, no facebook and no twitter, etc.

 

My HDTV is not connected to the internet, I use an indoor antenna which picks up 33 stations in my city.

 

My only phone is a corded landline.

 

I only use prepaid non-reloadable debit cards to purchase stuff on the internet.

 

When my 8-year-old computer dies, I'm not replacing it and I will go back to checking emails and Audiophile Style from the library once a week. The library offers free use for 30 minutes. At that point I will have to go back to purchasing everything in person or by old-fashioned mail order, and my apartment will not be connected to anything online or in the cloud.

I have dementia. I save all my posts in a text file I call Forums.  I do a search in that file to find out what I said or did in the past.

 

I still love music.

 

Teresa

Link to comment

Personally, I use voice control all of the time.  I focus on using 'Lady A' (a.k.a., Alexa) for simple on/off tasks.  I found that using the Logitech Harmony remote/hub was the best method to control multiple devices.  I use the simplest version; the 'Harmony Companion.'  If the device or component has a remote, then Harmony can control it.  And, there is a Harmony skill for Alexa.  So, turn on TV, mute TV, etc. are natural commands.  Likewise, in the audio room, mute music is a fantastic voice command.  And, if you have smart lights (I use Philips Hue) then turning the lights on/off or just right for music is (again) fantastic.

 

Personally, I do not need or want to control every button, knob or setting through voice control; it is unlikely that I would ever remember all the commands.  Then I would need the CliffsNotes for voice control :-P   

Link to comment

All I wish is for Siri to work with Roon. Though I have examples of Alexa and Amazon's assistant (gifts), I never plug them in because I don't want them listening to my life 24/7.   Google and Facecrook are nothing but data rapers, and I care not to align with them.   I have been an Apple customer since the first iPhone... I think Apple is the least intrusive into my life via Siri.

 

I use an iPad for Roon control, and it works seamlessly... I'm already spoiled... Roon with voice control would just be icing on the music listening cake! 

Link to comment
33 minutes ago, LarryMagoo said:

Roon with Voice control would just be icing on the music listening cake

The new crop of VC/AI programs will almost certainly be able to control Roon.  What's needed is the ability to send the same call to the processors in response to a voice command that's generated by clicking on a Roon icon. It's not as simple as that sounds, but it's doable today.  I've been playing with this using Braina, but I've not yet succeeded.
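The idea of a voice command sending the same call as a click can be sketched in Python as below. This is a toy under stated assumptions: the `Player` class, the action names, and the phrase patterns are hypothetical stand-ins, and none of this reflects an actual Roon, Braina, or Siri interface.

```python
from typing import Callable, Dict, List


class Player:
    """Hypothetical stand-in for a music player; not a real Roon API."""

    def __init__(self) -> None:
        self.log: List[str] = []

    def play(self, artist: str = "") -> None:
        self.log.append(f"play:{artist}")

    def pause(self) -> None:
        self.log.append("pause")


def build_actions(player: Player) -> Dict[str, Callable]:
    """One table of actions, shared by the GUI and the voice front end."""
    return {"play": player.play, "pause": player.pause}


def on_click(actions: Dict[str, Callable], button: str) -> None:
    """What a GUI icon would do: look up the action and call it."""
    actions[button]()


def on_voice(actions: Dict[str, Callable], phrase: str) -> bool:
    """Map a recognized phrase onto the same actions the GUI uses.

    Returns False when the phrase matches no known pattern.
    """
    text = phrase.strip().lower()
    prefix = "play music by "
    if text.startswith(prefix):
        actions["play"](text[len(prefix):])
        return True
    if text in actions:
        actions[text]()
        return True
    return False
```

Because both entry points go through the same `actions` table, anything reachable by a click is reachable by voice once a phrase is mapped to it. The hard parts in practice, speech recognition and getting access to the player's real calls, are outside this sketch.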

 

Siri is almost certainly capable of doing this now - there are many custom business applications out there resulting from licensed use of Siri technology.  But I suspect that Apple's not about to support any platforms at their own expense that don't augment their revenue stream beyond their costs.

Link to comment
