- 13 Jun, 2019 3 commits
-
-
Tycho Andersen authored
We have a do_clone(), which just calls a void f(void *) that it gets passed. We build up a struct consisting of two args that are just the actual arg and actual function. Let's just have the syscall do this for us. Signed-off-by: Tycho Andersen <tycho@tycho.ws>
-
Tycho Andersen authored
We should add a little note about the race in the previous patch. Signed-off-by: Tycho Andersen <tycho@tycho.ws>
-
Tycho Andersen authored
There are two problems with this code:

1. The math is wrong. We allocate a char *foo[__LXC_STACK_SIZE]; which means it's really sizeof(char *) * __LXC_STACK_SIZE, instead of just __LXC_STACK_SIZE.

2. We can't actually allocate it on our stack. When we use CLONE_VM (which we do in the shared ns case) that means that the new thread is just running one page lower on the stack, but anything that allocates a page on the stack may clobber data. This is a pretty short race window since we just do the shared ns stuff and then do a clone without CLONE_VM.

However, it does point out an interesting possible privilege escalation if things aren't configured correctly: do_share_ns() sets up namespaces while it shares the address space of the task that spawned it; once it enters the pid ns of the thing it's sharing with, the thing it's sharing with can ptrace it and write stuff into the host's address space. Since the function that does the clone() is lxc_spawn(), it has a struct cgroup_ops* on the stack, which itself has function pointers called later in the function, so it's possible to allocate shellcode in the address space of the host and run it fairly easily.

ASLR doesn't mitigate this since we know exactly the stack offsets; however this patch has the kernel allocate a new stack, which will help. Of course, the attacker could just check /proc/pid/maps to find the location of the stack, but they'd still have to guess where to write stuff in.

The thing that does prevent this is the default configuration of apparmor. Since the apparmor profile is set in the second clone, and apparmor prevents ptracing things under a different profile, attackers confined by apparmor can't do this. However, if users are using a custom configuration with shared namespaces, care must be taken to avoid this race.

Shared namespaces aren't widely used now, so perhaps this isn't a problem, but with the advent of crio-lxc for k8s, this functionality will be used more.

Signed-off-by: Tycho Andersen <tycho@tycho.ws>
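The arithmetic in point 1 can be checked with a small sketch. EXAMPLE_STACK_SIZE is a hypothetical stand-in, since the real value of __LXC_STACK_SIZE is not given in this message; the helper names are likewise made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; the real value of __LXC_STACK_SIZE is not shown here. */
#define EXAMPLE_STACK_SIZE (8 * 1024)

/* Bytes occupied by an array of EXAMPLE_STACK_SIZE pointers to char. */
static size_t pointer_array_bytes(void)
{
	return sizeof(char *[EXAMPLE_STACK_SIZE]);
}

/* Bytes occupied by an array of EXAMPLE_STACK_SIZE chars. */
static size_t char_array_bytes(void)
{
	return sizeof(char[EXAMPLE_STACK_SIZE]);
}

/* The pointer array is sizeof(char *) times larger than intended. */
static void check_sizes(void)
{
	assert(char_array_bytes() == EXAMPLE_STACK_SIZE);
	assert(pointer_array_bytes() == sizeof(char *) * EXAMPLE_STACK_SIZE);
}
```

On a typical 64-bit target sizeof(char *) is 8, so the pointer array would be eight times the intended size.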
-
- 21 May, 2019 1 commit
-
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
- 18 May, 2019 36 commits
-
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Specifically, refloat function arguments and remove useless comments. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Let lxc_attach() reuse the already initialized container. Closes https://github.com/lxc/lxd/issues/5755. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Thomas Parrott authored
Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
-
Thomas Parrott authored
Updates lxc_restore_phys_nics_to_netns() to move phys netdevs back to the monitor's network namespace rather than the previously hardcoded PID 1 net ns. This fixes cases where LXC is started inside a net ns different from that of PID 1 and, when the container is shut down, physical devices are moved back to a net ns other than the one the container was started from. Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Rachid Koucha authored
Suppressed error-prone semicolon in SYSTRACE() macro. Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
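As an illustration of why a trailing semicolon inside a statement-like macro is error prone, here is a minimal sketch. EXAMPLE_TRACE and trace_calls are hypothetical; the actual SYSTRACE() definition is not shown in this log, and this assumes the semicolon in question followed the macro body.

```c
#include <assert.h>

/* Hypothetical counter standing in for a real logging side effect. */
static int trace_calls;

/*
 * No semicolon after while (0): the caller supplies it. With a trailing
 * semicolon in the macro, "EXAMPLE_TRACE(1);" would expand to two
 * statements and the "else" below would no longer compile.
 */
#define EXAMPLE_TRACE(n) do { trace_calls += (n); } while (0)

static int trace_if_positive(int value)
{
	if (value > 0)
		EXAMPLE_TRACE(1);
	else
		return -1;

	return 0;
}

/* Exercises both branches of the if/else around the macro. */
static void check_trace(void)
{
	trace_calls = 0;
	assert(trace_if_positive(5) == 0);
	assert(trace_calls == 1);
	assert(trace_if_positive(-1) == -1);
	assert(trace_calls == 1);
}
```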
-
Rachid Koucha authored
Use %m under HAVE_M_FORMAT instead of strerror(). Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
-
Rachid Koucha authored
GLIBC supports %m to avoid calling strerror(). Using it saves some code space. ==> This check will define HAVE_M_FORMAT to be used wherever possible (e.g. log.h) Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
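A minimal sketch of the two equivalent forms, assuming a glibc-style printf that understands %m. The helper names are hypothetical, and the HAVE_M_FORMAT guard from the commit is deliberately left out so the block stands on its own.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Formats "<what>: <errno text>" using the %m extension. */
static int format_with_m(char *buf, size_t len, const char *what, int err)
{
	errno = err; /* %m prints the message for the current errno */
	return snprintf(buf, len, "%s: %m", what);
}

/* Same output, spelled with an explicit strerror() call. */
static int format_with_strerror(char *buf, size_t len, const char *what, int err)
{
	return snprintf(buf, len, "%s: %s", what, strerror(err));
}

/* Both helpers should produce identical text on a libc with %m support. */
static void check_formats(void)
{
	char a[128], b[128];

	format_with_m(a, sizeof(a), "open", ENOENT);
	format_with_strerror(b, sizeof(b), "open", ENOENT);
	assert(strcmp(a, b) == 0);
}
```

The %m form drops one function call and one argument per log statement, which is where the code-space saving mentioned above would come from.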
-
Rikard Falkeborn authored
Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
-
Rikard Falkeborn authored
Returning -1 in a function with return type bool is the same as returning true. Change to return false to indicate error properly. Detected with cppcheck. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
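The conversion can be seen in a short sketch. Both functions are hypothetical and only reproduce the pattern; they are not the functions touched by the commit.

```c
#include <assert.h>
#include <stdbool.h>

/* Bug pattern: any nonzero value, including -1, converts to true. */
static bool buggy_check(int fd)
{
	if (fd < 0)
		return -1; /* meant as an error, but the caller sees true */

	return true;
}

/* Fixed pattern: report the error as false. */
static bool fixed_check(int fd)
{
	if (fd < 0)
		return false;

	return true;
}

/* The buggy variant reports success even for an invalid descriptor. */
static void check_bool_returns(void)
{
	assert(buggy_check(-1) == true);
	assert(fixed_check(-1) == false);
	assert(fixed_check(3) == true);
}
```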
-
Rikard Falkeborn authored
Returning -1 in a function with return type bool is the same as returning true. Change to return false to indicate error properly. Detected with cppcheck. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
-
Rikard Falkeborn authored
Since _exit() will terminate, the return statement is dead code. Also, returning -1 from a function with bool as return type is confusing. Detected with cppcheck. Signed-off-by: Rikard Falkeborn <rikard.falkeborn@gmail.com>
-
Radostin Stoyanov authored
CRIU has only 4 levels of verbosity (errors, warnings, info, debug). Thus, using `-v4` is more appropriate. https://criu.org/Logging Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
-
Rachid Koucha authored
As suggested during the review. Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
-
Rachid Koucha authored
lxc-ls without root privileges on privileged containers should not display information. In lxc_container_new(), ongoing_create()'s result is not checked for all possible returned values. Hence, an unprivileged user can send command messages to the container's monitor.

For example:

$ lxc-ls -P /.../tests -f
NAME STATE   AUTOSTART GROUPS IPV4      IPV6 UNPRIVILEGED
ctr  -       0         -      -         -    false
$ sudo lxc-ls -P /.../tests -f
NAME STATE   AUTOSTART GROUPS IPV4      IPV6 UNPRIVILEGED
ctr  RUNNING 0         -      10.0.3.51 -    false

After this change:

$ lxc-ls -P /.../tests -f <-------- No more display without root privileges
$ sudo lxc-ls -P /.../tests -f
NAME STATE   AUTOSTART GROUPS IPV4      IPV6 UNPRIVILEGED
ctr  RUNNING 0         -      10.0.3.37 -    false
$

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Rachid Koucha authored
. Add the "--bbpath" option to pass an alternate busybox pathname instead of the one found from ${PATH}.
. Take this opportunity to add some formatting in the usage display.
. Since an attempt is made to pick the rootfs from the config file and set it to ${path}/rootfs, it is unnecessary to make it mandatory.

Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Rachid Koucha authored
Some error messages were not redirected to stderr. Moreover, do "exit 0" instead of "exit 1" when the "help" option is passed. Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
-
Christian Brauner authored
Use CLONE_PIDFD when possible. Note that the clone() syscall ignores unknown flags, which is usually a design mistake. However, for us this bug is a feature since we can just pass the flag along and see whether the kernel has given us a pidfd. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Thomas Parrott authored
The phys devices will now have their original MTUs recorded at start and restored at shutdown. This protects the original phys device from keeping any container-level MTU customisation once it is restored to the host. Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Thomas Parrott authored
Signed-off-by: Thomas Parrott <thomas.parrott@canonical.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Co-developed-by: David Howells <dhowells@redhat.com>
Signed-off-by: David Howells <dhowells@redhat.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Rachid Koucha authored
Added /dev in the mknod commands. Signed-off-by: Rachid Koucha <rachid.koucha@gmail.com>
-
Christian Brauner authored
Well, I added this syscall so we better use it. :) Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
The returns_twice attribute tells the compiler that a function may return more than one time. The compiler will ensure that all registers are dead before calling such a function and will emit a warning about the variables that may be clobbered after the second return from the function. Examples of such functions are setjmp and vfork. The longjmp-like counterpart of such function, if any, might need to be marked with the noreturn attribute. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
Serge Hallyn authored
Signed-off-by: Serge Hallyn <shallyn@cisco.com>
-
Christian Brauner authored
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
-
tomponline authored
Signed-off-by: tomponline <thomas.parrott@canonical.com>
-
tomponline authored
Signed-off-by: tomponline <thomas.parrott@canonical.com>
-
tomponline authored
Signed-off-by: tomponline <thomas.parrott@canonical.com>
-