Dirty Clone and SE Linux
There is a new Linux kernel exploit out named Dirty Clone [1].
The first thing to do to exploit this is to create a container with a separate network namespace via one of the following commands:
unshare -Urn
bwrap --bind / / --unshare-user --unshare-net --uid 0 --gid 0 /bin/bash
The JFrog people recommend “unshare -Urn”, but I gave the Bubblewrap command as an option because it should work equally well and may be permitted in some situations where unshare isn’t.
The next step in exploiting it is to use the ip command to bring the links up. Below is what happens in a user session on a SE Linux system with user_t as the login domain:
# ip link set lo up
RTNETLINK answers: Operation not permitted
That will give an entry in /var/log/audit/audit.log like the following:
type=AVC msg=audit(1782818856.618:3610): avc: denied { net_admin } for pid=1829 comm="ip" capability=12 scontext=user_u:user_r:user_t:s0 tcontext=user_u:user_r:user_t:s0 tclass=cap_userns permissive=0
type=SYSCALL msg=audit(1782818856.618:3610): arch=c000003e syscall=46 success=yes exit=32 a0=3 a1=7ffebe5f9e50 a2=0 a3=0 items=0 ppid=1638 pid=1829 auid=0 uid=0 gid=1000 euid=0 suid=0 fsuid=0 egid=1000 sgid=1000 fsgid=1000 tty=pts0 ses=17 comm="ip" exe="/usr/bin/ip" subj=user_u:user_r:user_t:s0 key=(null) ARCH=x86_64 SYSCALL=sendmsg AUID="root" UID="root" GID="test" EUID="root" SUID="root" FSUID="root" EGID="test" SGID="test" FSGID="test"
type=PROCTITLE msg=audit(1782818856.618:3610): proctitle=6970006C696E6B00736574006C6F007570
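The proctitle field in the last record is hex-encoded, with NUL bytes separating the arguments. Below is a minimal Python sketch for decoding it and for pulling the denied permission out of the AVC line. It assumes the record layout shown above, and the function names decode_proctitle and parse_avc are my own, chosen only for illustration.

```python
import re


def decode_proctitle(hex_value: str) -> str:
    """Decode a hex-encoded proctitle; the arguments are NUL-separated."""
    raw = bytes.fromhex(hex_value)
    parts = [p.decode("utf-8", errors="replace") for p in raw.split(b"\x00") if p]
    return " ".join(parts)


def parse_avc(line: str) -> dict:
    """Extract key=value fields and the { ... } permission list from an AVC record."""
    fields = {
        key: value.strip('"')
        for key, value in re.findall(r'(\w+)=("[^"]*"|\S+)', line)
    }
    perms = re.search(r"\{([^}]*)\}", line)
    fields["perms"] = perms.group(1).split() if perms else []
    return fields


if __name__ == "__main__":
    # Values copied from the audit records above.
    print(decode_proctitle("6970006C696E6B00736574006C6F007570"))  # ip link set lo up
    avc = parse_avc(
        'type=AVC msg=audit(1782818856.618:3610): avc: denied { net_admin } '
        'for pid=1829 comm="ip" capability=12 '
        'scontext=user_u:user_r:user_t:s0 tcontext=user_u:user_r:user_t:s0 '
        'tclass=cap_userns permissive=0'
    )
    print(avc["perms"], avc["tclass"], avc["scontext"])
```

This confirms that the denied process was running "ip link set lo up" and that the denial was for net_admin (capability 12) in the cap_userns class, not in the regular capability class.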
Unlike previous exploits such as Pintheft [2], this one doesn’t require any really uncommon access to the kernel (unless you consider setting up IPsec to be really uncommon), and the access it needs is allowed in many container setups.
Now on a system with the unconfined module removed (as described in the SE Linux Protection part of my post about Copy Fail [3]), the following domains are allowed net_admin in the cap_userns class:
# sesearch -A -c cap_userns -p net_admin
allow container_engine_t container_engine_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill lease linux_immutable mknod net_admin net_bind_service net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time sys_tty_config };
allow container_init_t container_init_t:cap_userns { chown dac_override dac_read_search fowner kill net_admin net_bind_service net_raw setgid setuid };
allow container_kvm_t container_kvm_t:cap_userns { chown dac_override dac_read_search fowner kill net_admin net_bind_service net_raw setgid setuid };
allow container_t container_t:cap_userns { chown dac_override dac_read_search fowner kill net_admin net_bind_service net_raw setgid setuid };
allow crio_t crio_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill lease linux_immutable mknod net_admin net_bind_service net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time sys_tty_config };
allow dockerd_t dockerd_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill lease linux_immutable mknod net_admin net_bind_service net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time sys_tty_config };
allow dockerd_user_t dockerd_user_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill lease linux_immutable mknod net_admin net_bind_service net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time sys_tty_config };
allow init_t init_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill lease linux_immutable mknod net_admin net_bind_service net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot sys_module sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time sys_tty_config };
allow iptables_t iptables_t:cap_userns { net_admin net_raw };
allow podman_t podman_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill lease linux_immutable mknod net_admin net_bind_service net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time sys_tty_config };
allow podman_user_t podman_user_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock ipc_owner kill lease linux_immutable mknod net_admin net_bind_service net_raw setfcap setgid setpcap setuid sys_admin sys_boot sys_chroot sys_nice sys_pacct sys_ptrace sys_rawio sys_resource sys_time sys_tty_config };
allow spc_t spc_t:cap_userns { audit_write chown dac_override dac_read_search fowner fsetid ipc_lock kill mknod net_admin net_bind_service net_raw setgid setpcap setuid sys_admin sys_chroot sys_nice sys_ptrace sys_rawio sys_resource };
allow spc_user_t spc_user_t:cap_userns { chown dac_override dac_read_search fowner kill net_admin net_bind_service net_raw setgid setuid };
allow staff_bubblewrap_t staff_bubblewrap_t:cap_userns { dac_override net_admin setpcap sys_admin sys_ptrace };
allow sysadm_bubblewrap_t sysadm_bubblewrap_t:cap_userns { dac_override net_admin setpcap sys_admin sys_ptrace };
allow user_bubblewrap_t user_bubblewrap_t:cap_userns { dac_override net_admin setpcap sys_admin sys_ptrace };
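With this many rules it is easier to compare domains programmatically. The following is a rough Python sketch that parses sesearch allow rules and lists the source domains granting a given permission. It assumes the one-rule-per-line output format shown above; the helper name domains_with is hypothetical, and the sample embeds three of the rules from the listing.

```python
import re

# Matches "allow SRC TGT:CLASS { perms };" and also "allow SRC TGT:CLASS perm;".
RULE = re.compile(
    r"^allow\s+(\S+)\s+(\S+):(\S+)\s+(?:\{([^}]*)\}|([^\s;{]+))\s*;"
)


def domains_with(perm: str, sesearch_output: str, tclass: str = "cap_userns") -> list:
    """Return the sorted source domains whose rule for tclass includes perm."""
    found = set()
    for line in sesearch_output.splitlines():
        match = RULE.match(line.strip())
        if not match or match.group(3) != tclass:
            continue
        perms = (match.group(4) or match.group(5)).split()
        if perm in perms:
            found.add(match.group(1))
    return sorted(found)


if __name__ == "__main__":
    # Three rules copied from the sesearch output above.
    sample = """\
allow container_t container_t:cap_userns { chown dac_override dac_read_search fowner kill net_admin net_bind_service net_raw setgid setuid };
allow iptables_t iptables_t:cap_userns { net_admin net_raw };
allow user_bubblewrap_t user_bubblewrap_t:cap_userns { dac_override net_admin setpcap sys_admin sys_ptrace };
"""
    print(domains_with("net_admin", sample))
    print(domains_with("sys_admin", sample))
```

Running the same function over the full output with a permission such as sys_admin or sys_module shows which of these domains hold far more than network administration rights inside a user namespace.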
Conclusion
It seems that SE Linux configured in the strict mode prevents this exploit in the most obvious use case. But with the range of container-related domains that are granted net_admin access in user namespaces, it seems quite likely that some configurations and use cases will permit it.
Overall the protection that the standard policy for SE Linux can offer (in a non-default configuration) against net_admin access isn’t bad, but isn’t very good either.
I think this will be the first of many exploits based on cap_userns access and that we need to do some work in tightening the SE Linux access controls on such things. One possible way of doing this is to have a program run inside a container, in a domain that has permissions such as net_admin, to set up the container. Domain transitions from the regular programs run in the container (the actual work) to the domain used for network setup would then not be allowed.
The increasing use of containers by applications is only going to make this problem worse. I think that what we need is something like Flatpak for the vast majority of desktop/phone applications with a container setup program that works with apps packaged in the distribution packaging method (not from Flathub). This is something I’m going to investigate for future blog posts.