Author: 耗子007
All commands are based on Docker 1.11.
Runtime privileges and Linux capabilities
The run command has several options related to privileges:
- --cap-add: Add Linux capabilities
- --cap-drop: Drop Linux capabilities
- --privileged=false: Give extended privileges to this container
- --device=[]: Allows you to run devices inside the container without the --privileged flag.
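Note: containers are unprivileged by default, so many privileged operations, such as certain system calls and device access, are not permitted. In addition, seccomp security profiles were added in version 1.10, so a container may hold a capability and still find some system calls blocked by the security policy. For details on seccomp, see the document "Docker Security Policy".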
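A container started with docker run --privileged can access all devices on the host. Docker also adjusts the AppArmor or SELinux configuration so that the container gets nearly the same access to the host as processes running outside containers.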
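--device makes one or more specific devices usable inside the container. By default the container gets read, write, and mknod permissions on these devices; the permissions can be changed by appending a ":rwm"-style suffix.
For background on mknod, see this blog post.
Example: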
$ docker run --device=/dev/sda:/dev/xvdc --rm -it ubuntu fdisk /dev/xvdc
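
The permission suffix can be illustrated with a small sketch. The device_flag function below is a hypothetical helper, not part of Docker; it only assembles a --device flag and checks that the suffix uses nothing but r, w, and m. The command is printed rather than executed, so no Docker daemon is assumed.

```shell
#!/bin/sh
# device_flag HOST_PATH CONTAINER_PATH PERMS
# Hypothetical helper (not part of Docker): prints a --device flag with an
# explicit permission suffix. PERMS must be a non-empty combination of
# r (read), w (write) and m (mknod).
device_flag() {
    host=$1
    ctr=$2
    perms=$3
    case $perms in
        '' | *[!rwm]*)
            echo "device_flag: PERMS must consist of r, w, m only" >&2
            return 1
            ;;
    esac
    printf '%s\n' "--device=${host}:${ctr}:${perms}"
}

# Print the command rather than running it, so the sketch needs no Docker daemon.
# Read-only mapping (r): the container may read the device but not write to it.
echo "docker run $(device_flag /dev/sda /dev/xvdc r) --rm -it ubuntu fdisk /dev/xvdc"
```

With the r suffix the container should be able to read /dev/xvdc but not modify it; using rw or rwm instead restores write or mknod access.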
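Compared with the blunt, all-or-nothing --privileged flag, --cap-add and --cap-drop give finer-grained control over privileges.
Docker currently supports the following capabilities: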
Capability Key | Capability Description |
---|---|
SETPCAP | Modify process capabilities. |
SYS_MODULE | Load and unload kernel modules. |
SYS_RAWIO | Perform I/O port operations (iopl(2) and ioperm(2)). |
SYS_PACCT | Use acct(2), switch process accounting on or off. |
SYS_ADMIN | Perform a range of system administration operations. |
SYS_NICE | Raise process nice value (nice(2), setpriority(2)) and change the nice value for arbitrary processes. |
SYS_RESOURCE | Override resource limits. |
SYS_TIME | Set system clock (settimeofday(2), stime(2), adjtimex(2)); set real-time (hardware) clock. |
SYS_TTY_CONFIG | Use vhangup(2); employ various privileged ioctl(2) operations on virtual terminals. |
MKNOD | Create special files using mknod(2). |
AUDIT_WRITE | Write records to kernel auditing log. |
AUDIT_CONTROL | Enable and disable kernel auditing; change auditing filter rules; retrieve auditing status and filtering rules. |
MAC_OVERRIDE | Override Mandatory Access Control (MAC). Implemented for the Smack Linux Security Module (LSM). |
MAC_ADMIN | Allow MAC configuration or state changes. Implemented for the Smack LSM. |
NET_ADMIN | Perform various network-related operations. |
SYSLOG | Perform privileged syslog(2) operations. |
CHOWN | Make arbitrary changes to file UIDs and GIDs (see chown(2)). |
NET_RAW | Use RAW and PACKET sockets. |
DAC_OVERRIDE | Bypass file read, write, and execute permission checks. |
FOWNER | Bypass permission checks on operations that normally require the file system UID of the process to match the UID of the file. |
DAC_READ_SEARCH | Bypass file read permission checks and directory read and execute permission checks. |
FSETID | Don’t clear set-user-ID and set-group-ID permission bits when a file is modified. |
KILL | Bypass permission checks for sending signals. |
SETGID | Make arbitrary manipulations of process GIDs and supplementary GID list. |
SETUID | Make arbitrary manipulations of process UIDs. |
LINUX_IMMUTABLE | Set the FS_APPEND_FL and FS_IMMUTABLE_FL i-node flags. |
NET_BIND_SERVICE | Bind a socket to internet domain privileged ports (port numbers less than 1024). |
NET_BROADCAST | Make socket broadcasts, and listen to multicasts. |
IPC_LOCK | Lock memory (mlock(2), mlockall(2), mmap(2), shmctl(2)). |
IPC_OWNER | Bypass permission checks for operations on System V IPC objects. |
SYS_CHROOT | Use chroot(2), change root directory. |
SYS_PTRACE | Trace arbitrary processes using ptrace(2). |
SYS_BOOT | Use reboot(2) and kexec_load(2), reboot and load a new kernel for later execution. |
LEASE | Establish leases on arbitrary files (see fcntl(2)). |
SETFCAP | Set file capabilities. |
WAKE_ALARM | Trigger something that will wake up the system. |
BLOCK_SUSPEND | Employ features that can block system suspend. |
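
To check which of the capabilities in the table a process actually holds, the kernel exposes hexadecimal bitmasks such as CapEff in /proc/<pid>/status. The sketch below decodes such a mask into the key names used in the table. The decode_caps function is a hypothetical helper, and the bit order is taken from the kernel header linux/capability.h rather than from the table itself.

```shell
#!/bin/sh
# Capability names in kernel bit order (bit 0 first), following <linux/capability.h>.
CAP_NAMES="CHOWN DAC_OVERRIDE DAC_READ_SEARCH FOWNER FSETID KILL SETGID SETUID
SETPCAP LINUX_IMMUTABLE NET_BIND_SERVICE NET_BROADCAST NET_ADMIN NET_RAW
IPC_LOCK IPC_OWNER SYS_MODULE SYS_RAWIO SYS_CHROOT SYS_PTRACE SYS_PACCT
SYS_ADMIN SYS_BOOT SYS_NICE SYS_RESOURCE SYS_TIME SYS_TTY_CONFIG MKNOD LEASE
AUDIT_WRITE AUDIT_CONTROL SETFCAP MAC_OVERRIDE MAC_ADMIN SYSLOG WAKE_ALARM
BLOCK_SUSPEND"

# decode_caps HEXMASK
# Hypothetical helper: prints the names of the capabilities whose bits are set
# in HEXMASK (hex digits without a 0x prefix, as shown in /proc/<pid>/status).
decode_caps() {
    mask=$((0x$1))
    bit=0
    out=""
    for name in $CAP_NAMES; do
        # Test bit number $bit of the mask.
        if [ $(( ($mask >> $bit) & 1 )) -eq 1 ]; then
            out="${out}${out:+ }${name}"
        fi
        bit=$(($bit + 1))
    done
    printf '%s\n' "$out"
}

# Bits 0 and 27 set: CHOWN and MKNOD.
decode_caps 0000000008000001
```

Inside a container, the effective mask of the current process can be read with grep CapEff /proc/self/status and passed to decode_caps, which makes it easy to see the effect of --cap-add and --cap-drop.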
Both --cap-add and --cap-drop accept ALL, meaning add or drop every capability. For example, to add all capabilities except MKNOD:
$ docker run --cap-add=ALL --cap-drop=MKNOD ...
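
ALL can also be used the other way around: drop everything and add back only what a workload needs. The sketch below builds such a flag list. The least_privilege_flags function is a hypothetical helper, IMAGE and COMMAND are placeholders, and the command is printed rather than executed.

```shell
#!/bin/sh
# least_privilege_flags CAP...
# Hypothetical helper: prints flags that drop every capability and then add
# back only the ones named as arguments.
least_privilege_flags() {
    flags="--cap-drop=ALL"
    for cap in "$@"; do
        flags="${flags} --cap-add=${cap}"
    done
    printf '%s\n' "$flags"
}

# Print the command rather than running it; IMAGE and COMMAND are placeholders.
echo "docker run $(least_privilege_flags NET_BIND_SERVICE CHOWN) IMAGE COMMAND"
```

Which capabilities a given image really needs is an assumption to verify case by case; NET_BIND_SERVICE and CHOWN are used here only as examples.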