- 19 Jan, 2018 1 commit
-
-
Christian Brauner authored
- mapped_hostid_entry() - idmap_add() Closes #2033. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
- 02 Jan, 2018 11 commits
-
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
We don't allow non-pty devices anyway so don't let open() create unneeded files. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
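The idea in this message can be sketched as follows. This is a minimal illustration, not the actual LXC change: the helper name `open_existing_tty()` is hypothetical, and the use of `O_NOCTTY` is an assumption. Leaving out `O_CREAT` means a mistyped or missing path fails instead of leaving a stray regular file behind.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: open a device only if it already exists and is a tty. */
static int open_existing_tty(const char *path)
{
	int fd;

	assert(path != NULL);

	/* No O_CREAT: a missing path is an error, nothing gets created. */
	fd = open(path, O_RDWR | O_CLOEXEC | O_NOCTTY);
	if (fd < 0)
		return -1;

	/* Reject anything that is not a terminal device. */
	if (!isatty(fd)) {
		close(fd);
		errno = ENOTTY;
		return -1;
	}

	return fd;
}
```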
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Marcos Paulo de Souza authored
As the other tools already do, show the usage message when -h or --help is used. Signed-off-by:Marcos Paulo de Souza <marcos.souza.org@gmail.com>
-
Christian Brauner authored
The handler for the signal fd will detect when the init process of a container has exited and cause the mainloop to close. However, this can happen before the console handlers - or any other events for that matter - are handled. So in the case of init exiting we still need to allow for all buffered input to the console to be handled before exiting. This allows us to capture output from short-lived init processes. This is conceptually equivalent to my implementation of ExecReaderToChannel() https://github.com/lxc/lxd/blob/master/shared/util_linux.go#L527 Closes #1694. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
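A rough sketch of the "drain what is still buffered" step might look like this. It is not the LXC mainloop code: `drain_fd()` is a hypothetical helper, and it assumes the source descriptor is either non-blocking or will report end-of-file (a pty master typically reports EIO once the slave side is closed).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy everything still readable from 'from' to 'to'.
 * Returns the number of bytes copied, or -1 on a real error. */
static ssize_t drain_fd(int from, int to)
{
	char buf[4096];
	ssize_t total = 0;

	assert(from >= 0 && to >= 0);

	for (;;) {
		ssize_t n = read(from, buf, sizeof(buf));
		if (n < 0) {
			if (errno == EINTR)
				continue;
			/* Nothing buffered anymore, or the pty slave is gone. */
			if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EIO)
				break;
			return -1;
		}
		if (n == 0)
			break; /* end-of-file */

		/* Write the whole chunk, handling short writes. */
		ssize_t off = 0;
		while (off < n) {
			ssize_t w = write(to, buf + off, (size_t)(n - off));
			if (w < 0) {
				if (errno == EINTR)
					continue;
				return -1;
			}
			off += w;
		}
		total += n;
	}

	return total;
}
```

Calling something like this after init has exited, but before the mainloop is torn down, is what lets output from short-lived init processes survive.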
-
Christian Brauner authored
This makes it clearer why handlers return what value. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
On set{g,u}id() the kernel does:

	/* dumpability changes */
	if (!uid_eq(old->euid, new->euid) ||
	    !gid_eq(old->egid, new->egid) ||
	    !uid_eq(old->fsuid, new->fsuid) ||
	    !gid_eq(old->fsgid, new->fsgid) ||
	    !cred_cap_issubset(old, new)) {
		if (task->mm)
			set_dumpable(task->mm, suid_dumpable);
		task->pdeath_signal = 0;
		smp_wmb();
	}

which means we need to re-enable the death signal after the set{g,u}id(). Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
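As a hedged illustration of the consequence described here, the following sketch switches ids and then sets the parent death signal again with prctl(PR_SET_PDEATHSIG). The helper names are hypothetical and the error handling is deliberately simple; it is not the code from this commit.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: change gid and uid, then re-arm the death signal. */
static int switch_ids_keep_pdeathsig(uid_t uid, gid_t gid, int sig)
{
	assert(sig > 0);

	if (setgid(gid) < 0)
		return -1;
	if (setuid(uid) < 0)
		return -1;

	/* The credential change may have reset pdeath_signal to 0. */
	if (prctl(PR_SET_PDEATHSIG, (unsigned long)sig, 0, 0, 0) < 0)
		return -1;

	return 0;
}

/* Hypothetical helper: report the currently configured death signal. */
static int current_pdeathsig(void)
{
	int sig = 0;

	if (prctl(PR_GET_PDEATHSIG, &sig, 0, 0, 0) < 0)
		return -1;

	return sig;
}
```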
-
Christian Brauner authored
Since we are now dumpable we can open /proc/<child-pid>/ns/cgroup so let's avoid the overhead of sending around fds. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
When we call set{u,g}id() the kernel will make us undumpable. This is unnecessary since we can guarantee that whatever is running inside the child process at this point is fully trusted by the parent. Making us dumpable lets users use debuggers on the child process before the exec as well and also allows us to open /proc/<child-pid> files in lieu of the child. Note that we only need to perform the prctl(PR_SET_DUMPABLE, ...) if our effective uid on the host is not 0. If our effective uid on the host is 0 then we will keep all capabilities in the child user namespace across set{g,u}id(). Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
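The conditional prctl() call described above could be sketched like this. The function names are hypothetical, and passing the host effective uid as a parameter is an assumption made to keep the example self-contained; how LXC actually tracks that value is not shown in this log.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: make the process dumpable again after set{g,u}id(),
 * but only when the host effective uid was not 0. */
static int make_dumpable_if_unprivileged(uid_t host_euid)
{
	if (host_euid == 0)
		return 0; /* nothing to do in the root case */

	return prctl(PR_SET_DUMPABLE, 1, 0, 0, 0);
}

/* Returns the current dumpable flag (1 means dumpable). */
static int is_dumpable(void)
{
	return prctl(PR_GET_DUMPABLE, 0, 0, 0, 0);
}
```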
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
- 01 Jan, 2018 11 commits
-
-
Christian Brauner authored
This way we can rely on the kernel's copy-on-write support similar to fork(). Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
This is a copy-on-write (no stack passed) variant of lxc_clone(). Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
This is to avoid bad surprises caused by older glibc's pid cache (up to 2.25) when using clone(). Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Because of older glibc's pid cache (up to 2.25) whenever clone() is called the child must retrieve its own pid via lxc_raw_getpid(). Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
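A minimal sketch of a cache-free pid lookup, assuming it is done by invoking the getpid syscall directly. The name `raw_getpid_sketch()` is hypothetical and only stands in for the lxc_raw_getpid() helper mentioned above, whose real definition is not shown here.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Ask the kernel for our pid directly, bypassing any libc-side cache. */
static pid_t raw_getpid_sketch(void)
{
	return (pid_t)syscall(SYS_getpid);
}
```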
-
Christian Brauner authored
- test CLONE_VFORK - test CLONE_FILES Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Receive fd for LSM security module before we set{g,u}id(). The reason is that on set{g,u}id() a) the kernel will make us undumpable and b) we will change our effective uid. This means our effective uid will be different from the effective uid of the process that created us, which means that this process no longer has capabilities in our namespace, including CAP_SYS_PTRACE. This means we will not be able to read any /proc/<pid> files for the process anymore when /proc is mounted with hidepid={1,2}. So let's get the lsm label fd before the set{g,u}id(). Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
This lets us simplify the whole file a lot and makes things way clearer. It also lets us avoid the infamous pid cache. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Marcos Paulo de Souza authored
At this point, macros such as DEBUG or ERROR do not take effect because this code is called from cgroup_ops_init() (cgroup.c), which runs with __attribute__((constructor)) before any log level is set from any tool like lxc-start, so these messages are lost. From now on, use the same LXC_DEBUG_CGFSNG environment variable to control these messages. Signed-off-by:Marcos Paulo de Souza <marcos.souza.org@gmail.com>
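One way such environment-controlled output could look is sketched below. Only the variable name LXC_DEBUG_CGFSNG comes from the message; the helper, the macro name, and the rule "enabled whenever the variable is set" are assumptions made for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumption: any value, even an empty one, switches the messages on. */
static bool cgfsng_debug_enabled(void)
{
	return getenv("LXC_DEBUG_CGFSNG") != NULL;
}

/* Hypothetical macro: print straight to stderr, since the regular log
 * machinery is not configured yet when constructors run. */
#define CGFSNG_DEBUG(...)                                \
	do {                                             \
		if (cgfsng_debug_enabled()) {            \
			fputs("cgfsng: ", stderr);       \
			fprintf(stderr, __VA_ARGS__);    \
		}                                        \
	} while (0)
```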
-
独孤昊天 authored
If lxc_abstract_unix_connect() fails and returns -1, this code never goes to retry. Signed-off-by:liuhao <liuhao27@huawei.com>
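The shape of the bug, a failed connect that never reaches the retry path, can be illustrated with a generic retry loop. Everything here is hypothetical: the callback type, `connect_with_retry()`, and the stand-in connector `fail_n_times()` are not LXC functions, and the real fix may look quite different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical connector callback: returns an fd >= 0, or -1 on failure. */
typedef int (*connect_fn)(void *ctx);

/* Retry until the connector succeeds or the attempts are used up. */
static int connect_with_retry(connect_fn try_connect, void *ctx, int max_attempts)
{
	assert(try_connect != NULL && max_attempts > 0);

	for (int attempt = 0; attempt < max_attempts; attempt++) {
		int fd = try_connect(ctx);
		if (fd >= 0)
			return fd;
		/* A -1 result must lead back here, not out of the function. */
	}

	return -1;
}

/* Stand-in connector for illustration: fails *ctx times, then "connects". */
static int fail_n_times(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -1;
	}

	return 42;
}
```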
-
- 18 Dec, 2017 1 commit
-
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
- 17 Dec, 2017 16 commits
-
-
Christian Brauner authored
lxc.init.cmd is the new key that stable-2.0 doesn't know about. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
coverity: #1426132 coverity: #1426133 Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
This is based on raw_clone in systemd but adapted to our needs. The main reason is that we need an implementation of fork()/clone() that does guarantee us that no pthread_atfork() handlers are run. While clone() in glibc currently doesn't run pthread_atfork() handlers we should be fine, but there's no guarantee that this won't be the case in the future. So let's do the syscall directly - or as directly as we can. An additional nice feature is that we get fork() behavior, i.e. lxc_raw_clone() returns 0 in the child and the child pid in the parent. Our implementation tries to make sure that we cover all cases according to kernel sources. Note that we are not interested in any arguments that could be passed after the stack. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
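A simplified sketch of the approach, under loud assumptions: it only targets architectures where the clone syscall takes the flags first and the stack second (for example x86-64 or arm64), whereas some architectures such as s390 use a different order and need special casing. The function names are hypothetical and this is not the lxc_raw_clone() implementation itself.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork-like clone: returns 0 in the child and the child pid in the parent. */
static pid_t raw_clone_sketch(unsigned long flags)
{
	/* Without a new stack or extra arguments these flags cannot work. */
	if (flags & (CLONE_VM | CLONE_PARENT_SETTID | CLONE_CHILD_SETTID |
		     CLONE_CHILD_CLEARTID | CLONE_SETTLS)) {
		errno = EINVAL;
		return -1;
	}

	/* NULL stack: the child continues on a copy-on-write copy of ours. */
	return (pid_t)syscall(SYS_clone, flags | SIGCHLD, NULL);
}

/* Usage illustration: clone, let the child exit with 'code', reap it. */
static int raw_clone_exit_status(int code)
{
	int status;
	pid_t pid = raw_clone_sketch(0);

	if (pid < 0)
		return -1;

	if (pid == 0)
		_exit(code); /* child */

	if (waitpid(pid, &status, 0) != pid)
		return -1;

	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Going through syscall() rather than the libc wrappers is what keeps pthread_atfork() handlers out of the picture.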
-
Christian Brauner authored
When we report STOPPED to a caller and then close the command socket it is technically possible - and I've seen this happen on the test builders - that a container start() right after a wait() will receive ECONNREFUSED because it called open() before we close(). So for all new state clients simply close the command socket. This will inform all state clients that the container is STOPPED and also prevents a race between an open()/close() on the command socket causing a new process to get ECONNREFUSED because we haven't yet closed the command socket. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Tycho Andersen authored
Signed-off-by:Tycho Andersen <tycho@tycho.ws>
-
Tycho Andersen authored
...otherwise we'll kill everyone on the machine. Instead, let's explicitly try to kill our children. Let's do a best effort against fork bombs by disabling forking via the pids cgroup if it exists. This is best effort for a number of reasons:
* the pids cgroup may not be available
* the container may have bind mounted /dev/null over pids.max, so the write doesn't do anything
Signed-off-by:Tycho Andersen <tycho@tycho.ws>
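The best-effort step might be sketched as below. It assumes that "disabling forking" means writing 0 to the pids.max file of the relevant cgroup directory; the helper name and the way the directory is passed in are hypothetical, and failures are expected to be tolerated by the caller.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try to forbid new processes in the given pids cgroup.
 * Returns 0 on success, -1 if the limit could not be written. */
static int disable_forking_sketch(const char *cgroup_dir)
{
	char path[4096];
	ssize_t written;
	int fd, ret;

	assert(cgroup_dir != NULL);

	ret = snprintf(path, sizeof(path), "%s/pids.max", cgroup_dir);
	if (ret < 0 || (size_t)ret >= sizeof(path))
		return -1;

	/* The pids cgroup may simply not exist: that is not fatal. */
	fd = open(path, O_WRONLY | O_CLOEXEC);
	if (fd < 0)
		return -1;

	/* Assumption: a limit of 0 stops any further fork() in the cgroup. */
	written = write(fd, "0", 1);
	close(fd);

	return written == 1 ? 0 : -1;
}
```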
-
Christian Brauner authored
Prior to this patch we raced with a very short-lived init process. Essentially, the init process could exit before we had time to record the cgroup namespace causing the container to abort and report ABORTING to the caller when it actually started just fine. Let's not do this. (This uses syscall(SYS_getpid) in the child to retrieve the pid just in case we're on an older glibc version and we end up in the namespace sharing branch of the actual lxc_clone() call.) Additionally this fixes the shortlived tests. They were faulty so far and should have actually failed because of the cgroup namespace recording race but the ret variable used to return from the function was not correctly initialized. This fixes it. Furthermore, the shortlived tests used the c->error_num variable to determine success or failure but this is actually not correct when the container is started daemonized. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
The error_num value doesn't tell us anything since the container hasn't exited. Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
Starting with commit

	commit c5b93afb
	Author: Li Feng <lifeng68@huawei.com>
	Date: Mon Jul 10 17:19:52 2017 +0800

	start: dup std{in,out,err} to pty slave

	In the case the container has a console with a valid slave pty file descriptor we duplicate std{in,out,err} to the slave file descriptor so console logging works correctly. When the container does not have a valid slave pty file descriptor for its console and is started daemonized we should dup to /dev/null.

	Closes #1646.

	Signed-off-by:Li Feng <lifeng68@huawei.com>
	Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>

we made std{err,in,out} a duplicate of the slave file descriptor of the console if it existed. This meant we also duplicated all of them when we executed application containers in the foreground even if some std{err,in,out} file descriptor did not refer to a {p,t}ty. This blocked use cases such as:

	echo foo | lxc-execute -n -- cat

which are very valid and common with application containers but less common with system containers where we don't have to care about this. So my suggestion is to unconditionally duplicate std{err,in,out} to the console file descriptor if we are either running daemonized - this ensures that daemonized application containers with a single bash shell keep on working - or when we are not running an application container. In other cases we only duplicate those file descriptors that actually refer to a {p,t}ty. This logic is similar to what we do for lxc-attach already. Refers to #1690. Closes #2028. Reported-by:Felix Abecassis <fabecassis@nvidia.com> Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
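The selection rule described in this message can be sketched roughly as follows. The helper name and its parameters are hypothetical; in particular the single `unconditional` flag stands in for "running daemonized, or not an application container", and the descriptors are passed in as an array only so the sketch can be exercised without touching the real std{in,out,err}.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: point the given descriptors at the console.
 * Returns the number of descriptors redirected, or -1 on error. */
static int redirect_to_console(int console_fd, const int *fds, size_t nfds,
			       bool unconditional)
{
	int redirected = 0;

	assert(fds != NULL);

	for (size_t i = 0; i < nfds; i++) {
		/* Leave pipes and files alone unless told otherwise, so that
		 * input piped in from another command keeps working. */
		if (!unconditional && !isatty(fds[i]))
			continue;

		if (dup2(console_fd, fds[i]) < 0)
			return -1;

		redirected++;
	}

	return redirected;
}
```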
-
Christian Brauner authored
remove logically dead code Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
free allocated memory Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
check return value of snprintf() Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
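For illustration only, a typical pattern for checking snprintf() is shown below. The helper `build_path()` is hypothetical and unrelated to the specific call site this commit touched; the point is that both a negative return and a return value greater than or equal to the buffer size (truncation) count as failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: join dir and name into buf, failing on truncation. */
static int build_path(char *buf, size_t size, const char *dir, const char *name)
{
	int ret;

	assert(buf != NULL && dir != NULL && name != NULL);

	ret = snprintf(buf, size, "%s/%s", dir, name);
	if (ret < 0 || (size_t)ret >= size)
		return -1; /* encoding error or output did not fit */

	return 0;
}
```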
-
Christian Brauner authored
remove logically dead code Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
initialize handler Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-
Christian Brauner authored
remove logically dead code Signed-off-by:Christian Brauner <christian.brauner@ubuntu.com>
-