From what I know, there was a recent change and regular rabbit is now beta rabbit. But now when I ask a question, r1 keeps saying things like “searching for thing”, “searching for other thing”, “searching for image of thing”, etc.
I don’t really want to hear that “searching for” stuff out loud. r1 is supposed to be an assistant. If I had a human assistant, I wouldn’t want them narrating every step of a task out loud, like “I’m starting to drive to the dry cleaners” or “I’m pulling into the parking lot of the dry cleaners”. I would just want them to report back when the task is done: “I dropped off your clothes at the dry cleaners.”
Flashing that stuff on the screen as text is great, so you know r1 is “thinking” and working on the task, but listening to r1 say a string of “searching for thing” lines is too much unnecessary talking, in my opinion.
Just let me hear the initial auditory cue like “got it!” or “working on it”, with no more audio until the answer is ready. Until then, the only r1 output should be visual, so that if you need to check whether r1 is actually working on the task, you can glance at it and see either the bouncing dots “…” or visual-only messages like “searching for thing”.
Since I, like my blind acquaintances, use Rabbit with my eyes closed, I always appreciate acoustic feedback. It would be nice to have some sounds during the search that indicate R1 is doing something. Often, Beta Rabbit’s responses end with sources or a note that links have been stored in the Rabbithole. These texts are currently not read aloud, but they would be insightful for “listeners only”.
What I’m referring to is completely unnecessary audio that wasn’t happening before this recent cloud update.
This audio of “searching for thing”, “searching for thing”, “searching for image”, “searching for other thing”, “searching for image again”, etc. is not needed and is not the right approach, in my opinion, for indicating that r1 is “thinking” or working on a task.
I am very supportive of accessibility features, but I would think this repetitive, unnecessary audio would be annoying even to those who are visually impaired.
What I would suggest instead is an accessibility feature that can be toggled on and off in an accessibility section of the settings.
This feature could play a simple melody or set of sounds that would become associated with r1 “thinking” or working on a task.
This would be similar to the music they play when you’re put on hold during a phone call. A simple melodic cue would be easy on the ears and on the mind, and would create a nice contrast with the useful spoken words of an actual reply from r1. This would give visually impaired users an auditory signal without having to hear these repetitive “searching for thing” sentences.
I constantly have situations where my r1 lists off everything it’s doing, taking upwards of a minute and a half to finish saying everything, and then it just tells me that an error occurred, wasting my time. The model needs to use discernment when explaining what’s happening. Of course, if a task is going to take some time, it’d be nice to hear what it’s doing, but if it’s running basic searches, it should stay silent and just give us the answer as soon as possible.
I had the same experience: rapidly repeated, annoying “doing this”, “doing that”, getting stuck in a loop and ending with no answer. I hope that will be fixed soon.
I would go even further and say that hearing “Got it!” and “I’m working on it” for every inquiry is unnecessary. It gets repetitive and time-consuming, and it takes away from the “real conversational” tone of the dialogue.
I did a full shutdown via settings and another full shutdown via 8x PTT. It made things a little better, but the loops are still happening, just not as frequently.
I think r1 needs to be tweaked to not vocalize this unnecessary speech in general. The loops wouldn’t be as annoying if it were just silent text flashing on the screen until r1 is ready to give a final response.
A quick vocalization like “got it!” to let you know r1 has started, visual cues on screen to show that r1 is “thinking”, then no more speech until r1 has something useful to say. And for visually impaired users, an accessibility option for some kind of simple melody or tone to indicate r1 is “thinking”. Something like the sketch below.
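Just to make the idea concrete, here’s a rough sketch of that policy in Python. Every name here (respond, run_task, ConsoleUI, etc.) is made up for illustration; this isn’t rabbit’s actual code, just the behavior this thread is asking for:

```python
class ConsoleUI:
    """Stand-in for the device's voice/screen/speaker outputs."""
    def speak(self, text):
        print(f"[voice]  {text}")      # audible speech
    def show_text(self, text):
        print(f"[screen] {text}")      # silent, on-screen only
    def play_tone(self):
        print("[tone]   ding")         # short accessibility cue

def respond(query, run_task, ui, accessibility_tone=False):
    """One spoken ack up front, visual-only progress, then the spoken answer."""
    ui.speak("Got it!")                 # the only speech before the answer
    for kind, text in run_task(query):  # the task emits (kind, text) events
        if kind == "status":
            ui.show_text(text)          # "searching for ..." stays on screen
            if accessibility_tone:
                ui.play_tone()          # optional tone instead of spoken status
        elif kind == "answer":
            ui.speak(text)              # speech resumes only for the result

def fake_task(query):
    # stand-in for the real pipeline: a few status events, then the answer
    yield ("status", f"searching for {query}")
    yield ("status", "searching for images")
    yield ("answer", f"Here's what I found about {query}.")

respond("dry cleaners nearby", fake_task, ConsoleUI(), accessibility_tone=True)
```

With the tone toggled off, you’d hear exactly two things: “Got it!” and the answer.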
After the recent cloud change there is too much vocalized “play-by-play” of things that aren’t important to the user. Even when I save a note now, there’s a bunch of extra, unnecessary r1 talk, and it wasn’t like that before.
Bit late to this thread but it would be great to have the option to turn off the ‘let me take a look’-type statements - perhaps the r1 acknowledgement could just be in text form, on screen (a bit like when Perplexity Pro analyses a query - it shows what it’s doing / thinking about on screen, which is cool).
Just as a point of reference on how this is perceived: my wife, who does not use the device, overheard an r1 response the other day that started with 4-5 cycles of “searching for, searching for, searching for”. She gave me a “what the hell is going on” kind of look, which I agree with.
If I have a smart assistant, I want to pose a question and get an answer. Now I don’t mind little signals or cues that the r1 is looking for an answer, but they should be more natural. Really, for me, the “…” animation is sufficient.
Yeah, the current r1 is talking way too much, to the point where it feels more like a “debug” mode.
They need to minimize the amount of unnecessary audio.
Now when I save a note, r1 repeats everything I said. It becomes tedious (and potentially embarrassing if other people are around) to have everything you just said repeated out loud by r1.
This to me feels like something you would enable to debug issues if you are experiencing r1 not saving data correctly. This is not a comfortable or natural way to interact with a device imo.
If they want to give users a way to know if the note was saved correctly, showing the text visually and silently on the screen is much better.
I just want r1 to say simple cues like “got it!” or “note saved!”.
If r1 gets “confused”, then speaking up would be acceptable too: r1 saying “I didn’t catch that, can you repeat that?” or “I’m not sure what you mean by that?”, etc.
Questions like this feel natural, since this is how humans interact with each other. It doesn’t feel natural to say out loud exactly what you are doing at all times, and it doesn’t feel natural to repeat everything a person told you unless you are specifically asked to do so.