Feature voice message as mp3 #4920

Merged
merged 24 commits into from
Apr 11, 2025

Conversation

nicodh
Member

@nicodh nicodh commented Apr 3, 2025

This is a workaround as long as iOS does not support the weba codec and as long as we don't have some audio converter in core.

@nicodh nicodh marked this pull request as draft April 3, 2025 12:38
nicodh added 2 commits April 3, 2025 14:39
 - uses MediaRecorder which is restricted to audio/webm or audio/mp4
   codecs

 based on pr #3456
The only codec supported natively by Chrome's MediaRecorder is weba (WebM
audio), which is not supported on iOS.

This PR adds a standalone AudioRecorder with a simple dependency on lame
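
The codec restriction described in these commit messages can be probed at runtime. A minimal sketch, assuming a hypothetical helper `pickSupportedMimeType` that is not part of this PR; in a browser the predicate would be the static `MediaRecorder.isTypeSupported`, which is passed in as a parameter here so the logic also runs outside a browser. The candidate list is illustrative only.

```typescript
// Hypothetical helper, not part of this PR: returns the first MIME type
// that the given predicate accepts, or null if none is supported.
type IsSupported = (mimeType: string) => boolean

function pickSupportedMimeType(
  candidates: readonly string[],
  isSupported: IsSupported
): string | null {
  for (const mimeType of candidates) {
    if (isSupported(mimeType)) {
      return mimeType
    }
  }
  return null
}

// Illustrative candidates only.
const candidates = ['audio/webm;codecs=opus', 'audio/mp4', 'audio/mpeg']

// In a browser one would pass the real check:
//   pickSupportedMimeType(candidates, t => MediaRecorder.isTypeSupported(t))
// Here a stand-in predicate simulates an engine that only records audio/webm.
const webmOnly: IsSupported = t => t.startsWith('audio/webm')
const picked = pickSupportedMimeType(candidates, webmOnly)
```

When the only supported result is a format the receiving side cannot play, encoding separately (as this PR does with lame) is one way around it.
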
@nicodh nicodh force-pushed the feature-voice-message-rebased branch from ea00cea to 2638fe1 Compare April 3, 2025 12:39
@nicodh nicodh force-pushed the feature-voice-message-rebased branch from 2638fe1 to 79bf7d2 Compare April 3, 2025 12:41
@nicodh nicodh changed the title Feature voice message rebased Feature voice message as mp3 Apr 3, 2025
Collaborator

@WofWca WofWca left a comment

Haven't looked thoroughly yet.

@nicodh nicodh marked this pull request as ready for review April 4, 2025 09:56
@r10s
Member

r10s commented Apr 4, 2025

very nice to push that forward! tested a bit - voice recording and sending already works nicely, also for iOS - thanks a lot for giving us more time to adapt iOS to ogg, and for removing the pressure 🙏

some UI things that come to my mind:

  • i like the idea of the pulsating record icon

  • the time display during recording should use the normal font (as used in message input) and there should not be spaces before/after colon

  • it is not so nice that the normal message input now shifts to the left on entering the first letter (as the record button disappears). that appears restless and unstable.
    all of whatsapp/telegram/signal have the record button right of the input field - i think, we should go for that as well.
    this also fits nicely with the send button only shown when text is entered:
    we can have less moving parts now, by just replacing record-button by send-button - which both should have the exact same size

  • the pulsating audio and the time display during recording can stay where they are, on the left

  • for the "stop" button, it is a bit unclear what happens - it "stages" things, but that is a bit hard to explain. maybe just call it "ok" - and add a "cancel" button at the same time

  • i personally like the staging - though all of whatsapp/telegram/signal send directly, without staging and without the option to re-hear, never really got that part
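
The time-display point in the list above (no spaces before or after the colon) could look roughly like this. A minimal sketch, assuming a hypothetical `formatRecordingTime` helper and an `m:ss` layout; neither the name nor the exact layout is taken from the PR.

```typescript
// Hypothetical formatter: whole seconds -> "m:ss", no spaces around the colon.
function formatRecordingTime(totalSeconds: number): string {
  // Guard against negative or fractional input.
  const safe = Math.max(0, Math.floor(totalSeconds))
  const minutes = Math.floor(safe / 60)
  const seconds = safe % 60
  // Zero-pad the seconds so the width stays stable while recording.
  return minutes + ':' + (seconds < 10 ? '0' : '') + seconds
}
```

Keeping the string width stable also helps against the "restless" layout shifts mentioned in the same list.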

Member

@r10s r10s left a comment

very nice!

let's keep "audio controls" and "message sending controls" together, that means, move the buttons to the right, and the level meter next to the time and pulsating mic:

[image: Screenshot 2025-04-07 at 13 28 52]

(the "green" looks aggressive on the screenshot, but i think, it is okay, as it flickers fast anyways)

@r10s
Member

r10s commented Apr 7, 2025

ah, another thing that needs to be tackled before it can be released:

you can switch to another chat while recording, and the bar will stay as is. the message then is sent to whatever chat was selected last, this is quite unexpected ...

i suggest to stop recording and stage the recording when switching chats (this is what eg. signal is doing)

Comment on lines +23 to +24
sampleRate: 44100,
bitRate: 128,
Member

@r10s r10s Apr 7, 2025

this results in unexpectedly high data usage. until we can switch to opus, i suggest to halve both parameters, for voice that is usually good enough

Suggested change
- sampleRate: 44100,
- bitRate: 128,
+ sampleRate: 22050,
+ bitRate: 64,

just a rough estimation, for sampleRate, i am not 100% sure, maybe that's unrelated to file size, we should try out (older considerations)

moreover, we should respect the "media_quality" setting, but that can be done in another PR to not complicate things here further

Collaborator

If we're into downsampling, for speech 8000 would be enough to make it understandable, but yeah, that is gonna be phone-level quality.

https://en.wikipedia.org/wiki/Sampling_(signal_processing)#Speech_sampling

For reference, in Telegram a one-minute voice message takes up ~230KB.

Member Author

I played around with various values, but it didn't have any impact on the size of the recorded audio. It's always ~484kB for 30 seconds.

Member

hm, then there is a bug somewhere, maybe in the used library, ~1 mb/minute is the standard rate mp3 has at 128 kbit/s, 44.1 kHz

i think, it should not block merging this PR, but we might want to file an issue for that
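
The numbers quoted in this thread can be cross-checked with simple arithmetic. A rough sketch, assuming constant-bitrate encoding and decimal kilobytes (1 kB = 1000 bytes); the function names are made up for illustration. Under these assumptions the size depends only on bitrate and duration, and the observed ~484kB for 30 seconds works out to roughly 129 kbit/s, which would match a bitrate that stayed at 128 regardless of the configured values.

```typescript
// Constant-bitrate estimate: kbit/s * seconds gives kbit, divided by 8 gives kB.
function estimateSizeKB(bitRateKbps: number, seconds: number): number {
  return (bitRateKbps * 1000 * seconds) / 8 / 1000
}

// Inverse: the average bitrate implied by an observed file size.
function impliedBitRateKbps(sizeKB: number, seconds: number): number {
  return (sizeKB * 1000 * 8) / seconds / 1000
}

const oneMinuteAt128 = estimateSizeKB(128, 60) // 960 kB, roughly 1 MB per minute
const oneMinuteAt64 = estimateSizeKB(64, 60) // 480 kB per minute
const observedInThisPr = impliedBitRateKbps(484, 30) // about 129 kbit/s
const telegramReference = impliedBitRateKbps(230, 60) // about 30.7 kbit/s
```

This does not explain why changing the values had no effect; it only shows which setting the observed size is consistent with.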

Member Author

I will open a new issue when this PR is merged

@Simon-Laux
Member

Simon-Laux commented Apr 8, 2025

About colors in the volume meter: many apps such as discord, mumble (in first setup), audacity and professional Digital Audio Workstations have different colors for ranges, like green, yellow and red (red is if you are too loud so that it clips/overloads).

Bildschirmaufnahme.2025-04-08.um.06.43.41.mov
Bildschirmaufnahme.2025-04-08.um.06.45.34.mov

Just an idea, I'm not saying that we need it, especially not now, I think it is fine as is (with button placement that r10s posted). maybe we find a more intuitive solution in the future like a waveform.

@nicodh nicodh force-pushed the feature-voice-message-rebased branch from 48a1709 to 610f668 Compare April 8, 2025 14:08
@nicodh
Member Author

nicodh commented Apr 8, 2025

About colors in the volume meter: many apps such as discord, mumble (in first setup), audacity and professional Digital Audio Workstations have different colors for ranges, like green, yellow and red (red is if you are too loud so that it clips/overloads).

We have that too:
[image]

You can see it if you set the audio input level to the maximum in your system settings

Only question is: it's hard to detect the "real" limit where clipping starts...

I set the values here: https://github.com/deltachat/deltachat-desktop/pull/4920/files#diff-769f75b81cc687f60e9af48e627fdb8efe8608b32f185df559644c374358998dR131
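
For illustration, the green/yellow/red ranges discussed here could be mapped as follows. A minimal sketch, assuming a peak level normalized to 0..1; the threshold constants and function names are placeholders and are not the values set in the linked diff.

```typescript
type MeterZone = 'green' | 'yellow' | 'red'

// Assumed thresholds, for illustration only.
const YELLOW_FROM = 0.6
const RED_FROM = 0.9

// Peak absolute amplitude of one block of samples, clamped to 1.
function peakLevel(samples: Float32Array): number {
  let peak = 0
  samples.forEach(value => {
    peak = Math.max(peak, Math.abs(value))
  })
  return Math.min(peak, 1)
}

// Maps a normalized level to the colour zone of the meter.
function meterZone(level: number): MeterZone {
  if (level >= RED_FROM) return 'red'
  if (level >= YELLOW_FROM) return 'yellow'
  return 'green'
}

const zone = meterZone(peakLevel(new Float32Array([0.1, -0.95, 0.3])))
```

Since the actual point where clipping starts is hard to detect, a red threshold somewhat below full scale is a common compromise, but where exactly to put it is a judgment call.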

@nicodh
Member Author

nicodh commented Apr 8, 2025

ah, another thing that needs to be tackled before it can be released:

you can switch to another chat while recording, and the bar will stay as is. the message then is sent to whatever chat was selected last, this is quite unexpected ...

i suggest to stop recording and stage the recording when switching chats (this is what eg. signal is doing)

I fixed it now like this: when you switch chats without stopping the recording, it will be lost. If you stop recording before switching, your recording will be saved as a draft. That should be sufficient for now.
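
The behaviour described above can be modelled as a small state transition. This is a hypothetical sketch of the rule only, not the PR's code; `RecorderState`, `ChatSwitchResult` and `onChatSwitch` are names invented for illustration.

```typescript
type RecorderState = 'idle' | 'recording' | 'stopped'

interface ChatSwitchResult {
  state: RecorderState
  draftSaved: boolean
}

// Hypothetical model: what happens to the recorder when the user switches chats.
function onChatSwitch(state: RecorderState): ChatSwitchResult {
  if (state === 'recording') {
    // Still recording while switching: the recording is lost.
    return { state: 'idle', draftSaved: false }
  }
  if (state === 'stopped') {
    // Stopped before switching: the recording is kept as a draft.
    return { state: 'idle', draftSaved: true }
  }
  // Nothing was recorded, nothing to save.
  return { state: 'idle', draftSaved: false }
}
```

The earlier suggestion in this thread (stop and stage on switch) would change only the `'recording'` branch.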

"voice_send": {
"message": "Send a voice message"
},
"voice_send_cannot": {
Collaborator

I don't think this is the case.

Collaborator

@WofWca WofWca left a comment

This feels pretty solid overall.
I left a couple more non-critical comments, but I won't insist on addressing them in this MR. I'll also create an issue with further improvement ideas.

IMO this can be merged at the current state.

Thanks for finally bringing such a demanded feature to life!

Co-authored-by: WofWca <wofwca@protonmail.com>
Co-authored-by: WofWca <wofwca@protonmail.com>
@nicodh nicodh merged commit d887c6f into main Apr 11, 2025
13 of 14 checks passed
@nicodh nicodh deleted the feature-voice-message-rebased branch April 11, 2025 14:37
@WofWca WofWca mentioned this pull request Apr 11, 2025
3 tasks
4 participants