Introduction
Seccomp, short for secure computing mode, is a Linux kernel feature that provides an additional layer of security by restricting the system calls available to a process. Seccomp can help prevent exploits by reducing the attack surface of a process and limiting the damage that can be caused by successful attacks. In this article, we will explore the basics of seccomp and how it can be used to enhance the security of Linux systems.
How Seccomp Works
Seccomp restricts the system calls a process may make. Once a process enters seccomp mode, the kernel checks every system call against the active policy. A call that the policy does not permit is handled as the policy dictates, which in the simplest case means terminating the process with SIGKILL. Seccomp has two modes: strict mode and filter mode.
In strict mode, the process may only call read, write, _exit, and sigreturn. Any other system call causes the kernel to terminate the process with SIGKILL. The permitted set is fixed by the kernel and cannot be changed, so strict mode suits small compute-only workers that have already opened every file descriptor they need.
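The behavior described above can be observed with a short sketch. It assumes a Linux system with the kernel headers installed. The function name strict_mode_demo and its return codes are made up for illustration. The work happens in a forked child so that the calling process is not itself restricted.

```c
#include <linux/seccomp.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that enters strict mode and then makes
 * a system call outside the permitted set.
 * Returns the number of the signal that killed the child, -2 if the child
 * could not enable strict mode, or -1 on any other failure. */
int strict_mode_demo(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0)
            _exit(2);            /* strict mode refused or unsupported */
        syscall(SYS_getpid);     /* not permitted: the kernel sends SIGKILL */
        syscall(SYS_exit, 1);    /* only reached if the call was not blocked */
    }
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (WIFSIGNALED(status))
        return WTERMSIG(status);
    if (WIFEXITED(status) && WEXITSTATUS(status) == 2)
        return -2;
    return -1;
}
```

On a system where strict mode is available, strict_mode_demo() is expected to return SIGKILL. A return value of -2 means the kernel refused the request, which can happen when the process is already running under a seccomp filter.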
In filter mode, the process installs a BPF program that the kernel runs on every system call. The program inspects the system call number and its arguments and returns an action: allow the call, fail it with an error code, deliver a signal, or kill the process. A filter can be written as an allowlist (deny by default) or as a denylist (allow by default), which makes this mode suitable for applications whose system call needs are too broad for strict mode.
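To make the filter mechanism concrete, here is a minimal sketch that talks to the kernel interface directly, without any helper library. It assumes a Linux system with kernel headers. The names install_filter and filter_mode_demo are illustrative, and getppid was picked only because it is harmless to block. For brevity the filter does not check the architecture field of seccomp_data, which a production filter should do first.

```c
#include <errno.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <stddef.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Installs a filter that fails getppid() with EPERM and allows the rest.
 * A production filter would first check seccomp_data.arch; omitted here. */
static int install_filter(void) {
    struct sock_filter prog[] = {
        /* Load the system call number. */
        BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
        /* getppid: continue to the ERRNO return; anything else: skip it. */
        BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, SYS_getppid, 0, 1),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
        BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
    };
    struct sock_fprog fprog = {
        .len = (unsigned short)(sizeof prog / sizeof prog[0]),
        .filter = prog,
    };
    /* Needed so that an unprivileged process may install a filter. */
    if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
        return -1;
    return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &fprog);
}

/* Runs the filter in a child. Returns 0 if getppid() failed with EPERM,
 * 1 if it was not blocked, 2 if the filter could not be installed,
 * and -1 on fork/wait failure. */
int filter_mode_demo(void) {
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (install_filter() != 0)
            _exit(2);
        errno = 0;
        long rc = syscall(SYS_getppid);
        _exit(rc == -1 && errno == EPERM ? 0 : 1);
    }
    int status;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Returning an error code instead of killing the process is a choice made here so that the effect of the filter can be checked from inside the child. Replacing SECCOMP_RET_ERRNO with SECCOMP_RET_KILL_PROCESS would give the kill behavior instead.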
Implementing Seccomp Filters
Seccomp filters can be built with the libseccomp library, which provides functions for creating rules and loading them into the kernel. The kernel itself accepts filters as classic BPF (Berkeley Packet Filter) programs. libseccomp generates that program from the rules, so the filter does not have to be written in BPF by hand.
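Here is an example of a simple seccomp filter that allows only a few system calls:
#include <seccomp.h>
#include <sys/socket.h> /* AF_UNIX */

int main(void) {
    /* Default action: kill on any system call that has no allow rule. */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_KILL);
    if (ctx == NULL)
        return 1;
    /* socket() is allowed only when argument 0 (the domain) is AF_UNIX. */
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(socket), 1,
                     SCMP_CMP(0, SCMP_CMP_EQ, AF_UNIX));
    /* open() is allowed for any path: argument 0 is a pointer, and a
       seccomp filter cannot read the string it points to. */
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(open), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(write), 0);
    /* exit_group() lets the process terminate normally. */
    seccomp_rule_add(ctx, SCMP_ACT_ALLOW, SCMP_SYS(exit_group), 0);
    int rc = seccomp_load(ctx);
    seccomp_release(ctx);
    return rc == 0 ? 0 : 1;
}
This filter allows the socket system call when its first argument equals AF_UNIX, the open and write system calls with any arguments, and exit_group so that the program can exit. Any other system call terminates the program. Note that open cannot be restricted to a particular path such as “/etc/passwd”: seccomp compares argument values only, and for a path argument that value is a memory address, not the string stored there. The program must be linked with -lseccomp.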
Seccomp in Containers
Seccomp is especially useful for containerized applications. All containers on a host share one kernel, so a kernel vulnerability reachable through a system call can be used to escape a container or affect its neighbors. By applying seccomp filters, a container runtime can restrict the system calls available to the containerized application, reducing the risk of exploits and breaches.
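Docker and other container runtimes provide options for applying seccomp profiles. Here is an example of how to apply a custom seccomp profile to a service in Docker Compose:
version: '3'
services:
  myservice:
    image: myimage
    security_opt:
      - seccomp:./myseccomp.json
This YAML configuration file specifies that the container should use the seccomp profile defined in the myseccomp.json file. Each security_opt entry is a single string, so there is no space after the colon. With the docker command line, the equivalent option is --security-opt seccomp=./myseccomp.json.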
Common Seccomp Filters
Seccomp filters can be used to restrict any system call, but there are a few common system calls that are often blocked in seccomp filters to enhance security. Here are a few examples:
execve
The execve system call replaces the current process image with a new program. It is often blocked because an attacker who gains code execution typically uses it to spawn a shell or another tool. A process that never needs to launch other programs after startup can block execve, along with the related execveat, at little cost.
Here is a libseccomp rule that blocks the execve system call (ctx is a filter context created with seccomp_init):
seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(execve), 0);
ptrace
The ptrace system call is used for process debugging. It lets one process read and modify the memory and registers of another, so attackers can abuse it to inspect or take over other processes running on the system. This system call is often blocked in seccomp filters to prevent attackers from using it to elevate their privileges or access sensitive information.
Here is a rule that blocks the ptrace system call:
seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(ptrace), 0);
mount
The mount system call is used to mount filesystems, and can be used by attackers to gain elevated privileges or to mount malicious filesystems. This system call is often blocked in seccomp filters to prevent attackers from mounting unauthorized filesystems.
Here is a rule that blocks the mount system call:
seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(mount), 0);
socket
The socket system call is used to create network sockets, and can be used by attackers to establish network connections and exfiltrate data. This system call is often blocked in seccomp filters to prevent attackers from establishing network connections.
Here is a rule that blocks the socket system call:
seccomp_rule_add(ctx, SCMP_ACT_KILL, SCMP_SYS(socket), 0);
Conclusion
Seccomp filters provide an additional layer of security to Linux systems by restricting the system calls available to a process. By blocking system calls that attackers commonly use to reach sensitive information or execute arbitrary code, seccomp filters can limit the damage of a successful attack. Filters can be built with the libseccomp library, which compiles its rules into the BPF program that the kernel evaluates. Container runtimes can apply the same mechanism to containerized applications, reducing the risk of exploits and breaches.