push-to-talk on linux
Intro
Regularly doing online meetings and letting everyone listen in on your family drama because you forgot to mute the call is almost a rite of passage for remote workers. Sadly too few devices come with a hardware microphone mute button. So let’s hack together a software push-to-talk for Linux!
By the way, a web search may turn up alternative or better push-to-talk solutions for Linux, but this is my own procedure that I’ve been using happily. Your mileage may vary.
Requirements
You are going to need these:
- xbindkeys
- xset
- setxkbmap
- pulseaudio or pipewire
- an X11 desktop environment
- a keyboard key to sacrifice
We have to hijack a keyboard key for our push-to-talk functionality, so let’s pick on the CapsLock key. Install the necessary programs from your distro’s package repositories. Don’t know how? Just search for “install toolname in distro”!
the script
We will use xbindkeys to attach scripts for toggling microphone mute state to CapsLock’s key press and key release events. You need to run a command that can identify microphone input sources, then mute and unmute them. You can tinker with the outputs of various arguments for pactl to figure out a working command pipeline, or you can use this over-engineered mute-control script I’ve written for myself.
This can do more than just mute microphones, but for our purposes, mute-control.sh in mute --all and mute-control.sh in unmute --all are the two commands we need to use.
Save this script somewhere in $PATH, such as ~/.local/share/bin/mute-control.sh, and make it executable:
chmod 755 path/to/mute-control.sh
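If you would rather tinker with pactl yourself, a bare-bones pipeline might look like the sketch below. It assumes that pactl list short sources prints tab-separated columns with the source name in the second one, and that monitor sources (loopbacks of your speakers) have names ending in .monitor. The file name ptt-mic.sh and both function names are hypothetical, and this is not a description of how mute-control.sh works internally.

```shell
#!/bin/sh
# ptt-mic.sh (hypothetical name): mute or unmute every microphone source.

# Read `pactl list short sources` output on stdin and print the names of
# real input sources, skipping ".monitor" sources (assumed naming scheme).
list_input_sources() {
    awk -F'\t' '$2 != "" && $2 !~ /\.monitor$/ { print $2 }'
}

# $1 is passed straight to pactl: 1 = mute, 0 = unmute, toggle = flip.
set_all_mics() {
    pactl list short sources | list_input_sources |
        while IFS= read -r src; do
            pactl set-source-mute "$src" "$1"
        done
}

# Only act when called with an explicit sub-command.
case "${1:-}" in
    mute)   set_all_mics 1 ;;
    unmute) set_all_mics 0 ;;
esac
```

With something like this, ptt-mic.sh unmute and ptt-mic.sh mute would stand in for the two mute-control.sh commands. To see the current state, pactl list sources prints a Mute: yes or Mute: no line for each source.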
attaching to the key
This is simple. Create or update ~/.xbindkeysrc.scm[1] with this content:
; ~/.xbindkeysrc.scm ========================================================= ;
; xbindkeys scheme configuration file ;
; overrides ~/.xbindkeysrc (if exists) ;
;==============================================================================;
; push-to-talk with Caps_Lock
(begin
  (xbindkey '(Caps_Lock)
    "mute-control.sh in unmute --all"
  )
  (xbindkey '(Release Caps_Lock)
    "mute-control.sh in mute --all"
  )
)
Great! Now add xbindkeys -p -n to your desktop environment’s startup commands. You can run xbindkeys -p -n in a terminal right now to test your very own PTT. Isn’t this cool?
Not quite.
problem #1: caps-lock state toggle
You have a small problem now: pressing and releasing CapsLock does toggle microphone muting[2], but it also toggles the caps lock state, so everything you type next will turn into all-caps. The workaround for this is a little involved. First, let’s convert CapsLock to the Compose key with setxkbmap:
setxkbmap -option 'caps:none,compose:caps'
You should make that setxkbmap command permanent. Your desktop’s keyboard settings might already have a UI for setting this option; otherwise you can add it as a startup command. Or, add a line to ~/.xprofile like this, for example:
setxkbmap -layout us -option 'caps:none,compose:caps'
Alternatively, run localectl like this (again, this is just an example, do not run blindly):
localectl set-x11-keymap us,bd '' ',probhat' 'caps:none,compose:caps,grp:win_space_toggle'
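Whichever route you pick, it is worth checking that the option is really active, especially after logging in again. The output of setxkbmap -query should contain an options: line listing it. Here is a small sketch of that check; has_xkb_option is a hypothetical helper name, not something shipped with setxkbmap.

```shell
#!/bin/sh
# Succeed if the XKB option $1 appears in `setxkbmap -query` output on stdin.
has_xkb_option() {
    awk -v want="$1" '
        $1 == "options:" {
            n = split($2, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == want) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Only query the keyboard when an X session and setxkbmap are available.
if [ -n "${DISPLAY:-}" ] && command -v setxkbmap >/dev/null 2>&1; then
    if setxkbmap -query | has_xkb_option compose:caps; then
        echo "CapsLock is mapped to Compose"
    else
        echo "compose:caps is not active" >&2
    fi
fi
```

If the second message shows up, the setting did not survive the session start and is probably being overridden by your desktop’s own keyboard settings.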
Then replace Caps_Lock with Multi_key in the xbindkeys config:
; ~/.xbindkeysrc.scm ========================================================= ;
; xbindkeys scheme configuration file ;
; overrides ~/.xbindkeysrc (if exists) ;
;==============================================================================;
; push-to-talk with Multi_key
(begin
  (xbindkey '(Multi_key)
    "mute-control.sh in unmute --all"
  )
  (xbindkey '(Release Multi_key)
    "mute-control.sh in mute --all"
  )
)
Now it should work fine, except…
problem #2: key repeating
If you press and hold the key, the mic toggle probably goes insane, rapidly muting and unmuting for as long as you hold it. I guess this is because CapsLock is categorized as a modifier key and does not trigger key repeat, so pressing and holding it generates a single key press event, followed by a single release event when you let go. Multi_key/Compose is not considered a modifier key, so it has key repeat by default, meaning a stream of key press and key release events is generated as long as you hold the key.
To work around this, we need support from xset:
xset -r Multi_key
Instead of manually running this command every time, you should add this as a startup command in your desktop environment. Or in fact, we can start it from xbindkeys config directly:
; ~/.xbindkeysrc.scm ========================================================= ;
; xbindkeys scheme configuration file ;
; overrides ~/.xbindkeysrc (if exists) ;
;==============================================================================;
; push-to-talk with Multi_key
(begin
  (run-command "xset -r Multi_key > /dev/null")
  (xbindkey '(Multi_key)
    "mute-control.sh in unmute --all"
  )
  (xbindkey '(Release Multi_key)
    "mute-control.sh in mute --all"
  )
)
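One caveat: the xset man page documents -r with a numeric keycode rather than a keysym name. If the key still repeats on your system, a possible fallback is to look up the keycode(s) currently mapped to Multi_key and pass those instead. This sketch assumes xmodmap is installed (it is not in the requirements list) and that xmodmap -pke prints lines of the form keycode 66 = Multi_key ...; keycodes_for_keysym is a hypothetical helper name.

```shell
#!/bin/sh
# Print every keycode whose `xmodmap -pke` line (read from stdin) lists keysym $1.
keycodes_for_keysym() {
    awk -v sym="$1" '
        $1 == "keycode" {
            for (i = 4; i <= NF; i++)
                if ($i == sym) { print $2; break }
        }
    '
}

# Disable autorepeat per keycode; skipped when there is no X session.
if [ -n "${DISPLAY:-}" ] && command -v xmodmap >/dev/null 2>&1 \
        && command -v xset >/dev/null 2>&1; then
    xmodmap -pke | keycodes_for_keysym Multi_key |
        while IFS= read -r code; do
            xset -r "$code"
        done
fi
```

Run it after the setxkbmap step, since the CapsLock keycode only maps to Multi_key once compose:caps is active. xset q prints the current auto-repeat mask if you want to confirm the change.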
Done!
By the way, your CapsLock has bonus features now: you can hold the key for push-to-talk, and you can press Shift+CapsLock to use it as the Compose key.
This article was motivated by The Penguins Club BlogTalk event. #blogtalk #penguinsclub
[1] xbindkeys supports a plain-text configuration file, ~/.xbindkeysrc, and a Guile Scheme script config, ~/.xbindkeysrc.scm. The Scheme config is far more expressive, with support for programmable constructs like loops, functions etc. Isn’t this impressive?
[2] If it doesn’t, then I guess you have two problems 🤷‍♂️. Let’s closely review everything from the start.