Commit 9795e880 by Stéphane Graber Committed by GitHub

Merge pull request #1613 from brauner/2017-06-03/af_unix

abstract lxc_abstract_unix_{send,recv}_fd, bugfixes, and improvements
parents 3b011155 a394f952
...@@ -49,43 +49,71 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -49,43 +49,71 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<title>Description</title> <title>Description</title>
<para> <para>
The linux containers (<command>lxc</command>) are always created LXC is the well-known and heavily tested low-level Linux container
before being used. This creation defines a set of system runtime. It has been in active development since 2008 and has proven itself in
resources to be virtualized / isolated when a process is using critical production environments world-wide. Some of its core contributors
the container. By default, the pids, sysv ipc and mount points are the same people that helped to implement various well-known
are virtualized and isolated. The other system resources are containerization features inside the Linux kernel.
shared across containers, until they are explicitly defined in
the configuration file. For example, if there is no network
configuration, the network will be shared between the creator of
the container and the container itself, but if the network is
specified, a new network stack is created for the container and
the container can no longer use the network of its ancestor.
</para> </para>
<para> <para>
The configuration file defines the different system resources to LXC's main focus is system containers. That is, containers which offer an
be assigned for the container. At present, the utsname, the environment as close as possible to the one you'd get from a VM but
network, the mount points, the root file system, the user namespace, without the overhead that comes with running a separate kernel and
and the control groups are supported. simulating all the hardware.
</para> </para>
<para> <para>
Each option in the configuration file has the form <command>key This is achieved through a combination of kernel security features such as
= value</command> fitting in one line. The '#' character means namespaces, mandatory access control and control groups.
the line is a comment. List options, like capabilities and cgroups </para>
options, can be used with no value to clear any previously
defined values of that option. <para>
LXC supports unprivileged containers. Unprivileged containers are
containers that are run without any privilege. This requires support for
user namespaces in the kernel that the container is run on. LXC was the
first runtime to support unprivileged containers after user namespaces
were merged into the mainline kernel.
</para>
<para>
In essence, user namespaces isolate given sets of UIDs and GIDs. This is
achieved by establishing a mapping between a range of UIDs and GIDs on the
host to a different (unprivileged) range of UIDs and GIDs in the
container. The kernel will translate this mapping in such a way that
inside the container all UIDs and GIDs appear as you would expect from the
host whereas on the host these UIDs and GIDs are in fact unprivileged. For
example, a process running as UID and GID 0 inside the container might
appear as UID and GID 100000 on the host. The implementation and working
details can be gathered from the corresponding user namespace man page.
UID and GID mappings can be defined with the <option>lxc.id_map</option>
key.
</para>
<para>
Linux containers are defined with a simple configuration file. Each
option in the configuration file has the form <command>key =
value</command> fitting in one line. The "#" character means the line is a
comment. List options, like capabilities and cgroups options, can be used
with no value to clear any previously defined values of that option.
</para>
<para>
LXC namespaces configuration keys by using single dots. This means complex
configuration keys such as <option>lxc.network</option> expose various
subkeys such as <option>lxc.network.type</option>,
<option>lxc.network.link</option>, <option>lxc.network.ipv6</option>, and
others for even more fine-grained configuration.
</para> </para>
<refsect2> <refsect2>
<title>Configuration</title> <title>Configuration</title>
<para> <para>
In order to ease administration of multiple related containers, it In order to ease administration of multiple related containers, it is
is possible to have a container configuration file cause another possible to have a container configuration file cause another file to be
file to be loaded. For instance, network configuration loaded. For instance, network configuration can be defined in one common
can be defined in one common file which is included by multiple file which is included by multiple containers. Then, if the containers
containers. Then, if the containers are moved to another host, are moved to another host, only one file may need to be updated.
only one file may need to be updated.
</para> </para>
<variablelist> <variablelist>
...@@ -106,11 +134,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -106,11 +134,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<refsect2> <refsect2>
<title>Architecture</title> <title>Architecture</title>
<para> <para>
Allows one to set the architecture for the container. For example, Allows one to set the architecture for the container. For example, set a
set a 32bits architecture for a container running 32bits 32bits architecture for a container running 32bits binaries on a 64bits
binaries on a 64bits host. This fixes the container scripts host. This fixes the container scripts which rely on the architecture to
which rely on the architecture to do some work like do some work like downloading the packages.
downloading the packages.
</para> </para>
<variablelist> <variablelist>
...@@ -123,7 +150,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -123,7 +150,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Specify the architecture for the container. Specify the architecture for the container.
</para> </para>
<para> <para>
Valid options are Some valid options are
<option>x86</option>, <option>x86</option>,
<option>i686</option>, <option>i686</option>,
<option>x86_64</option>, <option>x86_64</option>,
...@@ -138,10 +165,9 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -138,10 +165,9 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<refsect2> <refsect2>
<title>Hostname</title> <title>Hostname</title>
<para> <para>
The utsname section defines the hostname to be set for the The utsname section defines the hostname to be set for the container.
container. That means the container can set its own hostname That means the container can set its own hostname without changing the
without changing the one from the system. That makes the one from the system. That makes the hostname private for the container.
hostname private for the container.
</para> </para>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
...@@ -160,12 +186,12 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -160,12 +186,12 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<refsect2> <refsect2>
<title>Halt signal</title> <title>Halt signal</title>
<para> <para>
Allows one to specify signal name or number, sent by lxc-stop to the Allows one to specify signal name or number sent to the container's
container's init process to cleanly shutdown the container. Different init process to cleanly shutdown the container. Different init systems
init systems could use different signals to perform clean shutdown could use different signals to perform clean shutdown sequence. This
sequence. This option allows the signal to be specified in kill(1) option allows the signal to be specified in kill(1) fashion, e.g.
fashion, e.g. SIGPWR, SIGRTMIN+14, SIGRTMAX-10 or plain number. The SIGPWR, SIGRTMIN+14, SIGRTMAX-10 or plain number. The default signal is
default signal is SIGPWR. SIGPWR.
</para> </para>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
...@@ -184,10 +210,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -184,10 +210,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<refsect2> <refsect2>
<title>Reboot signal</title> <title>Reboot signal</title>
<para> <para>
Allows one to specify signal name or number, sent by lxc-stop to Allows one to specify signal name or number to reboot the container.
reboot the container. This option allows signal to be specified in This option allows signal to be specified in kill(1) fashion, e.g.
kill(1) fashion, e.g. SIGTERM, SIGRTMIN+14, SIGRTMAX-10 or plain number. SIGTERM, SIGRTMIN+14, SIGRTMAX-10 or plain number. The default signal
The default signal is SIGINT. is SIGINT.
</para> </para>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
...@@ -206,10 +232,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -206,10 +232,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<refsect2> <refsect2>
<title>Stop signal</title> <title>Stop signal</title>
<para> <para>
Allows one to specify signal name or number, sent by lxc-stop to forcibly Allows one to specify signal name or number to forcibly shutdown the
shutdown the container. This option allows signal to be specified in container. This option allows signal to be specified in kill(1) fashion,
kill(1) fashion, e.g. SIGKILL, SIGRTMIN+14, SIGRTMAX-10 or plain number. e.g. SIGKILL, SIGRTMIN+14, SIGRTMAX-10 or plain number. The default
The default signal is SIGKILL. signal is SIGKILL.
</para> </para>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
...@@ -251,9 +277,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -251,9 +277,10 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<refsect2> <refsect2>
<title>Init ID</title> <title>Init ID</title>
<para> <para>
Sets the UID/GID to use for the init system, and subsequent command, executed by lxc-execute. Sets the UID/GID to use for the init system, and subsequent commands.
Note that using a non-root uid when booting a system container will
These options are only used when lxc-execute is started in a private user namespace. likely not work due to missing privileges. Setting the UID/GID is mostly
useful when running application containers.
Defaults to: UID(0), GID(0) Defaults to: UID(0), GID(0)
</para> </para>
...@@ -264,7 +291,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -264,7 +291,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
</term> </term>
<listitem> <listitem>
<para> <para>
UID to use within a private user namesapce for init. UID to use for init.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -274,7 +301,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -274,7 +301,7 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
</term> </term>
<listitem> <listitem>
<para> <para>
GID to use within a private user namesapce for init. GID to use for init.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
...@@ -325,18 +352,22 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -325,18 +352,22 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.type</option> <option>lxc.network.[i].type</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify what kind of network virtualization to be used specify what kind of network virtualization to be used
for the container. Each time for the container.
a <option>lxc.network.type</option> field is found a new Multiple networks can be specified by using an additional index
round of network configuration begins. In this way, <option>i</option>
several network virtualization types can be specified after all <option>lxc.network.*</option> keys. For example,
for the same container, as well as assigning several <option>lxc.network.0.type = veth</option> and
network interfaces for one container. The different <option>lxc.network.1.type = veth</option> specify two different
virtualization types can be: networks of the same type. All keys sharing the same index
<option>i</option> will be treated as belonging to the same
network. For example, <option>lxc.network.0.link = br0</option>
will belong to <option>lxc.network.0.type</option>.
Currently, the different virtualization types can be:
</para> </para>
<para> <para>
...@@ -427,12 +458,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -427,12 +458,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.flags</option> <option>lxc.network.[i].flags</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify an action to do for the Specify an action to do for the network.
network.
</para> </para>
<para><option>up:</option> activates the interface. <para><option>up:</option> activates the interface.
...@@ -442,83 +472,76 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -442,83 +472,76 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.link</option> <option>lxc.network.[i].link</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify the interface to be used for real network Specify the interface to be used for real network traffic.
traffic. </para>
</para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.mtu</option> <option>lxc.network.[i].mtu</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify the maximum transfer unit for this interface. Specify the maximum transfer unit for this interface.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.name</option> <option>lxc.network.[i].name</option>
</term> </term>
<listitem> <listitem>
<para> <para>
the interface name is dynamically allocated, but if The interface name is dynamically allocated, but if another name
another name is needed because the configuration files is needed because the configuration files being used by the
being used by the container use a generic name, container use a generic name, eg. eth0, this option will rename
eg. eth0, this option will rename the interface in the the interface in the container.
container.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.hwaddr</option> <option>lxc.network.[i].hwaddr</option>
</term> </term>
<listitem> <listitem>
<para> <para>
the interface mac address is dynamically allocated by The interface mac address is dynamically allocated by default to
default to the virtual interface, but in some cases, the virtual interface, but in some cases, this is needed to
this is needed to resolve a mac address conflict or to resolve a mac address conflict or to always have the same
always have the same link-local ipv6 address. link-local ipv6 address. Any "x" in address will be replaced by
Any "x" in address will be replaced by random value, random value, this allows setting hwaddr templates.
this allows setting hwaddr templates.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.ipv4</option> <option>lxc.network.[i].ipv4</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify the ipv4 address to assign to the virtualized Specify the ipv4 address to assign to the virtualized interface.
interface. Several lines specify several ipv4 addresses. Several lines specify several ipv4 addresses. The address is in
The address is in format x.y.z.t/m, format x.y.z.t/m, eg. 192.168.1.123/24.
eg. 192.168.1.123/24. The broadcast address should be
specified on the same line, right after the ipv4
address.
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.ipv4.gateway</option> <option>lxc.network.[i].ipv4.gateway</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify the ipv4 address to use as the gateway inside the Specify the ipv4 address to use as the gateway inside the
container. The address is in format x.y.z.t, eg. container. The address is in format x.y.z.t, eg. 192.168.1.123.
192.168.1.123.
Can also have the special value <option>auto</option>, Can also have the special value <option>auto</option>,
which means to take the primary address from the bridge which means to take the primary address from the bridge
...@@ -534,27 +557,26 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -534,27 +557,26 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.ipv6</option> <option>lxc.network.[i].ipv6</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify the ipv6 address to assign to the virtualized Specify the ipv6 address to assign to the virtualized
interface. Several lines specify several ipv6 addresses. interface. Several lines specify several ipv6 addresses. The
The address is in format x::y/m, address is in format x::y/m, eg.
eg. 2003:db8:1:0:214:1234:fe0b:3596/64 2003:db8:1:0:214:1234:fe0b:3596/64
</para> </para>
</listitem> </listitem>
</varlistentry> </varlistentry>
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.ipv6.gateway</option> <option>lxc.network.[i].ipv6.gateway</option>
</term> </term>
<listitem> <listitem>
<para> <para>
specify the ipv6 address to use as the gateway inside the Specify the ipv6 address to use as the gateway inside the
container. The address is in format x::y, container. The address is in format x::y, eg. 2003:db8:1:0::1
eg. 2003:db8:1:0::1
Can also have the special value <option>auto</option>, Can also have the special value <option>auto</option>,
which means to take the primary address from the bridge which means to take the primary address from the bridge
...@@ -569,11 +591,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -569,11 +591,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.script.up</option> <option>lxc.network.[i].script.up</option>
</term> </term>
<listitem> <listitem>
<para> <para>
add a configuration option to specify a script to be Add a configuration option to specify a script to be
executed after creating and configuring the network used executed after creating and configuring the network used
from the host side. The following arguments are passed from the host side. The following arguments are passed
to the script: container name and config section name to the script: container name and config section name
...@@ -594,11 +616,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -594,11 +616,11 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
<varlistentry> <varlistentry>
<term> <term>
<option>lxc.network.script.down</option> <option>lxc.network.[i].script.down</option>
</term> </term>
<listitem> <listitem>
<para> <para>
add a configuration option to specify a script to be Add a configuration option to specify a script to be
executed before destroying the network used from the executed before destroying the network used from the
host side. The following arguments are passed to the host side. The following arguments are passed to the
script: container name and config section name (net) script: container name and config section name (net)
...@@ -822,9 +844,9 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA ...@@ -822,9 +844,9 @@ Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
most cases should be a relative path, which will become most cases should be a relative path, which will become
relative to the mounted container root. For instance, relative to the mounted container root. For instance,
</para> </para>
<screen> <programlisting>
proc proc proc nodev,noexec,nosuid 0 0 proc proc proc nodev,noexec,nosuid 0 0
</screen> </programlisting>
<para> <para>
Will mount a proc filesystem under the container's /proc, Will mount a proc filesystem under the container's /proc,
regardless of where the root filesystem comes from. This regardless of where the root filesystem comes from. This
...@@ -1329,11 +1351,13 @@ proc proc proc nodev,noexec,nosuid 0 0 ...@@ -1329,11 +1351,13 @@ proc proc proc nodev,noexec,nosuid 0 0
allowed except for mknod, which will simply do nothing and allowed except for mknod, which will simply do nothing and
return 0 (success), looks like: return 0 (success), looks like:
</para> </para>
<screen>
2 <programlisting>
blacklist 2
mknod errno 0 blacklist
</screen> mknod errno 0
</programlisting>
<variablelist> <variablelist>
<varlistentry> <varlistentry>
<term> <term>
...@@ -1971,26 +1995,26 @@ mknod errno 0 ...@@ -1971,26 +1995,26 @@ mknod errno 0
mounting some locations and a changing root file system.</para> mounting some locations and a changing root file system.</para>
<programlisting> <programlisting>
lxc.utsname = complex lxc.utsname = complex
lxc.network.type = veth lxc.network.0.type = veth
lxc.network.flags = up lxc.network.0.flags = up
lxc.network.link = br0 lxc.network.0.link = br0
lxc.network.hwaddr = 4a:49:43:49:79:bf lxc.network.0.hwaddr = 4a:49:43:49:79:bf
lxc.network.ipv4 = 10.2.3.5/24 10.2.3.255 lxc.network.0.ipv4 = 10.2.3.5/24 10.2.3.255
lxc.network.ipv6 = 2003:db8:1:0:214:1234:fe0b:3597 lxc.network.0.ipv6 = 2003:db8:1:0:214:1234:fe0b:3597
lxc.network.ipv6 = 2003:db8:1:0:214:5432:feab:3588 lxc.network.0.ipv6 = 2003:db8:1:0:214:5432:feab:3588
lxc.network.type = macvlan lxc.network.1.type = macvlan
lxc.network.flags = up lxc.network.1.flags = up
lxc.network.link = eth0 lxc.network.1.link = eth0
lxc.network.hwaddr = 4a:49:43:49:79:bd lxc.network.1.hwaddr = 4a:49:43:49:79:bd
lxc.network.ipv4 = 10.2.3.4/24 lxc.network.1.ipv4 = 10.2.3.4/24
lxc.network.ipv4 = 192.168.10.125/24 lxc.network.1.ipv4 = 192.168.10.125/24
lxc.network.ipv6 = 2003:db8:1:0:214:1234:fe0b:3596 lxc.network.1.ipv6 = 2003:db8:1:0:214:1234:fe0b:3596
lxc.network.type = phys lxc.network.2.type = phys
lxc.network.flags = up lxc.network.2.flags = up
lxc.network.link = dummy0 lxc.network.2.link = dummy0
lxc.network.hwaddr = 4a:49:43:49:79:ff lxc.network.2.hwaddr = 4a:49:43:49:79:ff
lxc.network.ipv4 = 10.2.3.6/24 lxc.network.2.ipv4 = 10.2.3.6/24
lxc.network.ipv6 = 2003:db8:1:0:214:1234:fe0b:3297 lxc.network.2.ipv6 = 2003:db8:1:0:214:1234:fe0b:3297
lxc.cgroup.cpuset.cpus = 0,1 lxc.cgroup.cpuset.cpus = 0,1
lxc.cgroup.cpu.shares = 1234 lxc.cgroup.cpu.shares = 1234
lxc.cgroup.devices.deny = a lxc.cgroup.devices.deny = a
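As a rough illustration of the indexed lxc.network.[i].* scheme used in the example above, the following C sketch splits such a key into its index and its subkey. The helper name split_network_key is hypothetical and is not part of LXC; LXC's own configuration parser may work quite differently.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an LXC function): split a key of the form
 * "lxc.network.<i>.<subkey>" into the index <i> and the subkey string.
 * Returns 0 on success and -1 if the key does not match that form.
 */
static int split_network_key(const char *key, unsigned int *idx,
			     char *subkey, size_t subkey_len)
{
	static const char prefix[] = "lxc.network.";
	size_t prefix_len = sizeof(prefix) - 1;
	int consumed = 0;

	if (strncmp(key, prefix, prefix_len) != 0)
		return -1;
	key += prefix_len;

	/* Require a plain decimal index; sscanf alone would accept a sign. */
	if (!isdigit((unsigned char)key[0]))
		return -1;

	/* %n records how many characters the "<i>." part consumed. */
	if (sscanf(key, "%u.%n", idx, &consumed) != 1 || consumed == 0)
		return -1;

	key += consumed;
	if (key[0] == '\0' || strlen(key) >= subkey_len)
		return -1;

	strcpy(subkey, key);
	return 0;
}
```

With this helper, lxc.network.1.ipv4.gateway yields index 1 and subkey ipv4.gateway, while an unindexed key such as lxc.network.type is rejected.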
......
...@@ -22,6 +22,8 @@ ...@@ -22,6 +22,8 @@
*/ */
#include "config.h" #include "config.h"
#include <stdio.h>
#include <stdlib.h>
#include <stddef.h> #include <stddef.h>
#include <string.h> #include <string.h>
#include <unistd.h> #include <unistd.h>
...@@ -133,49 +135,66 @@ int lxc_abstract_unix_connect(const char *path) ...@@ -133,49 +135,66 @@ int lxc_abstract_unix_connect(const char *path)
return fd; return fd;
} }
int lxc_abstract_unix_send_fd(int fd, int sendfd, void *data, size_t size) int lxc_abstract_unix_send_fds(int fd, int *sendfds, int num_sendfds,
void *data, size_t size)
{ {
struct msghdr msg = { 0 }; int ret;
struct msghdr msg;
struct iovec iov; struct iovec iov;
struct cmsghdr *cmsg; struct cmsghdr *cmsg = NULL;
char cmsgbuf[CMSG_SPACE(sizeof(int))] = {0};
char buf[1] = {0}; char buf[1] = {0};
int *val; char *cmsgbuf;
size_t cmsgbufsize = CMSG_SPACE(num_sendfds * sizeof(int));
memset(&msg, 0, sizeof(msg));
memset(&iov, 0, sizeof(iov));
cmsgbuf = malloc(cmsgbufsize);
if (!cmsgbuf)
return -1;
msg.msg_control = cmsgbuf; msg.msg_control = cmsgbuf;
msg.msg_controllen = sizeof(cmsgbuf); msg.msg_controllen = cmsgbufsize;
cmsg = CMSG_FIRSTHDR(&msg); cmsg = CMSG_FIRSTHDR(&msg);
cmsg->cmsg_len = CMSG_LEN(sizeof(int));
cmsg->cmsg_level = SOL_SOCKET; cmsg->cmsg_level = SOL_SOCKET;
cmsg->cmsg_type = SCM_RIGHTS; cmsg->cmsg_type = SCM_RIGHTS;
val = (int *)(CMSG_DATA(cmsg)); cmsg->cmsg_len = CMSG_LEN(num_sendfds * sizeof(int));
*val = sendfd;
msg.msg_name = NULL; msg.msg_controllen = cmsg->cmsg_len;
msg.msg_namelen = 0;
memcpy(CMSG_DATA(cmsg), sendfds, num_sendfds * sizeof(int));
iov.iov_base = data ? data : buf; iov.iov_base = data ? data : buf;
iov.iov_len = data ? size : sizeof(buf); iov.iov_len = data ? size : sizeof(buf);
msg.msg_iov = &iov; msg.msg_iov = &iov;
msg.msg_iovlen = 1; msg.msg_iovlen = 1;
return sendmsg(fd, &msg, MSG_NOSIGNAL); ret = sendmsg(fd, &msg, MSG_NOSIGNAL);
free(cmsgbuf);
return ret;
} }
int lxc_abstract_unix_recv_fd(int fd, int *recvfd, void *data, size_t size) int lxc_abstract_unix_recv_fds(int fd, int *recvfds, int num_recvfds,
void *data, size_t size)
{ {
struct msghdr msg = { 0 }; int ret;
struct msghdr msg;
struct iovec iov; struct iovec iov;
struct cmsghdr *cmsg; struct cmsghdr *cmsg = NULL;
int ret, *val;
char cmsgbuf[CMSG_SPACE(sizeof(int))] = {0};
char buf[1] = {0}; char buf[1] = {0};
char *cmsgbuf;
size_t cmsgbufsize = CMSG_SPACE(num_recvfds * sizeof(int));
memset(&msg, 0, sizeof(msg));
memset(&iov, 0, sizeof(iov));
cmsgbuf = malloc(cmsgbufsize);
if (!cmsgbuf)
return -1;
msg.msg_name = NULL;
msg.msg_namelen = 0;
msg.msg_control = cmsgbuf; msg.msg_control = cmsgbuf;
msg.msg_controllen = sizeof(cmsgbuf); msg.msg_controllen = cmsgbufsize;
iov.iov_base = data ? data : buf; iov.iov_base = data ? data : buf;
iov.iov_len = data ? size : sizeof(buf); iov.iov_len = data ? size : sizeof(buf);
...@@ -188,17 +207,14 @@ int lxc_abstract_unix_recv_fd(int fd, int *recvfd, void *data, size_t size) ...@@ -188,17 +207,14 @@ int lxc_abstract_unix_recv_fd(int fd, int *recvfd, void *data, size_t size)
cmsg = CMSG_FIRSTHDR(&msg); cmsg = CMSG_FIRSTHDR(&msg);
/* if the message is wrong the variable will not be memset(recvfds, -1, num_recvfds * sizeof(int));
* filled and the peer will notified about a problem */ if (cmsg && cmsg->cmsg_len == CMSG_LEN(num_recvfds * sizeof(int)) &&
*recvfd = -1; cmsg->cmsg_level == SOL_SOCKET && cmsg->cmsg_type == SCM_RIGHTS) {
memcpy(recvfds, CMSG_DATA(cmsg), num_recvfds * sizeof(int));
if (cmsg && cmsg->cmsg_len == CMSG_LEN(sizeof(int)) &&
cmsg->cmsg_level == SOL_SOCKET &&
cmsg->cmsg_type == SCM_RIGHTS) {
val = (int *) CMSG_DATA(cmsg);
*recvfd = *val;
} }
out: out:
free(cmsgbuf);
return ret; return ret;
} }
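The multi-descriptor variant above relies on a single SCM_RIGHTS control message that carries an array of file descriptors. A minimal, self-contained sketch of that technique follows. The names send_fds and recv_fds are hypothetical stand-ins written for illustration, not the LXC functions from this commit, and unlike those functions they always send a single dummy payload byte.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical helper: pass `num` descriptors over a connected AF_UNIX socket. */
static int send_fds(int sock, const int *fds, int num)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	char byte = 0;
	size_t len;
	char *cbuf;
	int ret;

	assert(num > 0);
	len = CMSG_SPACE(num * sizeof(int));
	cbuf = calloc(1, len);
	if (!cbuf)
		return -1;

	/* At least one byte of regular data must accompany the control message. */
	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = sizeof(byte);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = len;

	/* One SCM_RIGHTS header followed by the whole descriptor array. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(num * sizeof(int));
	memcpy(CMSG_DATA(cmsg), fds, num * sizeof(int));

	ret = sendmsg(sock, &msg, MSG_NOSIGNAL);
	free(cbuf);
	return ret;
}

/* Hypothetical helper: receive exactly `num` descriptors; slots stay -1 on mismatch. */
static int recv_fds(int sock, int *fds, int num)
{
	struct msghdr msg;
	struct iovec iov;
	struct cmsghdr *cmsg;
	char byte = 0;
	size_t len;
	char *cbuf;
	int i, ret;

	assert(num > 0);
	len = CMSG_SPACE(num * sizeof(int));
	cbuf = calloc(1, len);
	if (!cbuf)
		return -1;

	for (i = 0; i < num; i++)
		fds[i] = -1;

	memset(&msg, 0, sizeof(msg));
	iov.iov_base = &byte;
	iov.iov_len = sizeof(byte);
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = cbuf;
	msg.msg_controllen = len;

	ret = recvmsg(sock, &msg, 0);
	if (ret > 0) {
		cmsg = CMSG_FIRSTHDR(&msg);
		/* Copy only if the message holds exactly the expected array. */
		if (cmsg && cmsg->cmsg_len == CMSG_LEN(num * sizeof(int)) &&
		    cmsg->cmsg_level == SOL_SOCKET &&
		    cmsg->cmsg_type == SCM_RIGHTS)
			memcpy(fds, CMSG_DATA(cmsg), num * sizeof(int));
	}

	free(cbuf);
	return ret;
}
```

A caller could, for example, create a socketpair(AF_UNIX, SOCK_STREAM, 0, sv), send both ends of a pipe with send_fds(sv[0], pipefds, 2), and pick them up with recv_fds(sv[1], got, 2). The received descriptors are new numbers in the receiver that refer to the same open files.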
......
...@@ -24,13 +24,17 @@ ...@@ -24,13 +24,17 @@
#ifndef __LXC_AF_UNIX_H #ifndef __LXC_AF_UNIX_H
#define __LXC_AF_UNIX_H #define __LXC_AF_UNIX_H
#include <stdio.h>
/* does not enforce \0-termination */ /* does not enforce \0-termination */
extern int lxc_abstract_unix_open(const char *path, int type, int flags); extern int lxc_abstract_unix_open(const char *path, int type, int flags);
extern int lxc_abstract_unix_close(int fd); extern int lxc_abstract_unix_close(int fd);
/* does not enforce \0-termination */ /* does not enforce \0-termination */
extern int lxc_abstract_unix_connect(const char *path); extern int lxc_abstract_unix_connect(const char *path);
extern int lxc_abstract_unix_send_fd(int fd, int sendfd, void *data, size_t size); extern int lxc_abstract_unix_send_fds(int fd, int *sendfds, int num_sendfds,
extern int lxc_abstract_unix_recv_fd(int fd, int *recvfd, void *data, size_t size); void *data, size_t size);
extern int lxc_abstract_unix_recv_fds(int fd, int *recvfds, int num_recvfds,
void *data, size_t size);
extern int lxc_abstract_unix_send_credential(int fd, void *data, size_t size); extern int lxc_abstract_unix_send_credential(int fd, void *data, size_t size);
extern int lxc_abstract_unix_rcv_credential(int fd, void *data, size_t size); extern int lxc_abstract_unix_rcv_credential(int fd, void *data, size_t size);
......
...@@ -986,7 +986,7 @@ int lxc_attach(const char* name, const char* lxcpath, lxc_attach_exec_t exec_fun ...@@ -986,7 +986,7 @@ int lxc_attach(const char* name, const char* lxcpath, lxc_attach_exec_t exec_fun
goto on_error; goto on_error;
/* Send child fd of the LSM security module to write to. */ /* Send child fd of the LSM security module to write to. */
ret = lxc_abstract_unix_send_fd(ipc_sockets[0], labelfd, NULL, 0); ret = lxc_abstract_unix_send_fds(ipc_sockets[0], &labelfd, 1, NULL, 0);
saved_errno = errno; saved_errno = errno;
close(labelfd); close(labelfd);
if (ret <= 0) { if (ret <= 0) {
...@@ -1273,7 +1273,7 @@ static int attach_child_main(void* data) ...@@ -1273,7 +1273,7 @@ static int attach_child_main(void* data)
if ((options->namespaces & CLONE_NEWNS) && (options->attach_flags & LXC_ATTACH_LSM) && init_ctx->lsm_label) { if ((options->namespaces & CLONE_NEWNS) && (options->attach_flags & LXC_ATTACH_LSM) && init_ctx->lsm_label) {
int on_exec; int on_exec;
/* Receive fd for LSM security module. */ /* Receive fd for LSM security module. */
ret = lxc_abstract_unix_recv_fd(ipc_socket, &lsm_labelfd, NULL, 0); ret = lxc_abstract_unix_recv_fds(ipc_socket, &lsm_labelfd, 1, NULL, 0);
if (ret <= 0) { if (ret <= 0) {
ERROR("Expected to receive file descriptor: %s.", strerror(errno)); ERROR("Expected to receive file descriptor: %s.", strerror(errno));
shutdown(ipc_socket, SHUT_RDWR); shutdown(ipc_socket, SHUT_RDWR);
......
...@@ -75,110 +75,110 @@ lxc_log_define(bdev, lxc); ...@@ -75,110 +75,110 @@ lxc_log_define(bdev, lxc);
/* aufs */ /* aufs */
static const struct bdev_ops aufs_ops = { static const struct bdev_ops aufs_ops = {
.detect = &aufs_detect, .detect = &aufs_detect,
.mount = &aufs_mount, .mount = &aufs_mount,
.umount = &aufs_umount, .umount = &aufs_umount,
.clone_paths = &aufs_clonepaths, .clone_paths = &aufs_clonepaths,
.destroy = &aufs_destroy, .destroy = &aufs_destroy,
.create = &aufs_create, .create = &aufs_create,
.can_snapshot = true, .can_snapshot = true,
.can_backup = true, .can_backup = true,
}; };
/* btrfs */ /* btrfs */
static const struct bdev_ops btrfs_ops = { static const struct bdev_ops btrfs_ops = {
.detect = &btrfs_detect, .detect = &btrfs_detect,
.mount = &btrfs_mount, .mount = &btrfs_mount,
.umount = &btrfs_umount, .umount = &btrfs_umount,
.clone_paths = &btrfs_clonepaths, .clone_paths = &btrfs_clonepaths,
.destroy = &btrfs_destroy, .destroy = &btrfs_destroy,
.create = &btrfs_create, .create = &btrfs_create,
.can_snapshot = true, .can_snapshot = true,
.can_backup = true, .can_backup = true,
}; };
/* dir */ /* dir */
static const struct bdev_ops dir_ops = { static const struct bdev_ops dir_ops = {
.detect = &dir_detect, .detect = &dir_detect,
.mount = &dir_mount, .mount = &dir_mount,
.umount = &dir_umount, .umount = &dir_umount,
.clone_paths = &dir_clonepaths, .clone_paths = &dir_clonepaths,
.destroy = &dir_destroy, .destroy = &dir_destroy,
.create = &dir_create, .create = &dir_create,
.can_snapshot = false, .can_snapshot = false,
.can_backup = true, .can_backup = true,
}; };
/* loop */ /* loop */
static const struct bdev_ops loop_ops = { static const struct bdev_ops loop_ops = {
.detect = &loop_detect, .detect = &loop_detect,
.mount = &loop_mount, .mount = &loop_mount,
.umount = &loop_umount, .umount = &loop_umount,
.clone_paths = &loop_clonepaths, .clone_paths = &loop_clonepaths,
.destroy = &loop_destroy, .destroy = &loop_destroy,
.create = &loop_create, .create = &loop_create,
.can_snapshot = false, .can_snapshot = false,
.can_backup = true, .can_backup = true,
}; };
/* lvm */ /* lvm */
static const struct bdev_ops lvm_ops = { static const struct bdev_ops lvm_ops = {
.detect = &lvm_detect, .detect = &lvm_detect,
.mount = &lvm_mount, .mount = &lvm_mount,
.umount = &lvm_umount, .umount = &lvm_umount,
.clone_paths = &lvm_clonepaths, .clone_paths = &lvm_clonepaths,
.destroy = &lvm_destroy, .destroy = &lvm_destroy,
.create = &lvm_create, .create = &lvm_create,
.can_snapshot = true, .can_snapshot = true,
.can_backup = false, .can_backup = false,
}; };
/* nbd */ /* nbd */
const struct bdev_ops nbd_ops = { const struct bdev_ops nbd_ops = {
.detect = &nbd_detect, .detect = &nbd_detect,
.mount = &nbd_mount, .mount = &nbd_mount,
.umount = &nbd_umount, .umount = &nbd_umount,
.clone_paths = &nbd_clonepaths, .clone_paths = &nbd_clonepaths,
.destroy = &nbd_destroy, .destroy = &nbd_destroy,
.create = &nbd_create, .create = &nbd_create,
.can_snapshot = true, .can_snapshot = true,
.can_backup = false, .can_backup = false,
}; };
/* overlay */ /* overlay */
static const struct bdev_ops ovl_ops = { static const struct bdev_ops ovl_ops = {
.detect = &ovl_detect, .detect = &ovl_detect,
.mount = &ovl_mount, .mount = &ovl_mount,
.umount = &ovl_umount, .umount = &ovl_umount,
.clone_paths = &ovl_clonepaths, .clone_paths = &ovl_clonepaths,
.destroy = &ovl_destroy, .destroy = &ovl_destroy,
.create = &ovl_create, .create = &ovl_create,
.can_snapshot = true, .can_snapshot = true,
.can_backup = true, .can_backup = true,
}; };
/* rbd */ /* rbd */
static const struct bdev_ops rbd_ops = { static const struct bdev_ops rbd_ops = {
.detect = &rbd_detect, .detect = &rbd_detect,
.mount = &rbd_mount, .mount = &rbd_mount,
.umount = &rbd_umount, .umount = &rbd_umount,
.clone_paths = &rbd_clonepaths, .clone_paths = &rbd_clonepaths,
.destroy = &rbd_destroy, .destroy = &rbd_destroy,
.create = &rbd_create, .create = &rbd_create,
.can_snapshot = false, .can_snapshot = false,
.can_backup = false, .can_backup = false,
}; };
/* zfs */ /* zfs */
static const struct bdev_ops zfs_ops = { static const struct bdev_ops zfs_ops = {
.detect = &zfs_detect, .detect = &zfs_detect,
.mount = &zfs_mount, .mount = &zfs_mount,
.umount = &zfs_umount, .umount = &zfs_umount,
.clone_paths = &zfs_clonepaths, .clone_paths = &zfs_clonepaths,
.destroy = &zfs_destroy, .destroy = &zfs_destroy,
.create = &zfs_create, .create = &zfs_create,
.can_snapshot = true, .can_snapshot = true,
.can_backup = true, .can_backup = true,
}; };
struct bdev_type { struct bdev_type {
...@@ -187,32 +187,33 @@ struct bdev_type { ...@@ -187,32 +187,33 @@ struct bdev_type {
}; };
static const struct bdev_type bdevs[] = { static const struct bdev_type bdevs[] = {
{.name = "zfs", .ops = &zfs_ops,}, { .name = "zfs", .ops = &zfs_ops, },
{.name = "lvm", .ops = &lvm_ops,}, { .name = "lvm", .ops = &lvm_ops, },
{.name = "rbd", .ops = &rbd_ops,}, { .name = "rbd", .ops = &rbd_ops, },
{.name = "btrfs", .ops = &btrfs_ops,}, { .name = "btrfs", .ops = &btrfs_ops, },
{.name = "dir", .ops = &dir_ops,}, { .name = "dir", .ops = &dir_ops, },
{.name = "aufs", .ops = &aufs_ops,}, { .name = "aufs", .ops = &aufs_ops, },
{.name = "overlayfs", .ops = &ovl_ops,}, { .name = "overlayfs", .ops = &ovl_ops, },
{.name = "loop", .ops = &loop_ops,}, { .name = "loop", .ops = &loop_ops, },
{.name = "nbd", .ops = &nbd_ops,}, { .name = "nbd", .ops = &nbd_ops, },
}; };
static const size_t numbdevs = sizeof(bdevs) / sizeof(struct bdev_type); static const size_t numbdevs = sizeof(bdevs) / sizeof(struct bdev_type);
/* helpers */ /* helpers */
-static const struct bdev_type *bdev_query(struct lxc_conf *conf, const char *src);
+static const struct bdev_type *bdev_query(struct lxc_conf *conf,
+const char *src);
static struct bdev *bdev_get(const char *type); static struct bdev *bdev_get(const char *type);
static struct bdev *do_bdev_create(const char *dest, const char *type, static struct bdev *do_bdev_create(const char *dest, const char *type,
const char *cname, struct bdev_specs *specs); const char *cname, struct bdev_specs *specs);
static int find_fstype_cb(char *buffer, void *data); static int find_fstype_cb(char *buffer, void *data);
static char *linkderef(char *path, char *dest); static char *linkderef(char *path, char *dest);
static bool unpriv_snap_allowed(struct bdev *b, const char *t, bool snap, static bool unpriv_snap_allowed(struct bdev *b, const char *t, bool snap,
bool maybesnap); bool maybesnap);
/* the bulk of this needs to become a common helper */ /* the bulk of this needs to become a common helper */
char *dir_new_path(char *src, const char *oldname, const char *name, char *dir_new_path(char *src, const char *oldname, const char *name,
const char *oldpath, const char *lxcpath) const char *oldpath, const char *lxcpath)
{ {
char *ret, *p, *p2; char *ret, *p, *p2;
int l1, l2, nlen; int l1, l2, nlen;
...@@ -244,11 +245,12 @@ char *dir_new_path(char *src, const char *oldname, const char *name, ...@@ -244,11 +245,12 @@ char *dir_new_path(char *src, const char *oldname, const char *name,
while ((p2 = strstr(src, oldname)) != NULL) { while ((p2 = strstr(src, oldname)) != NULL) {
strncpy(p, src, p2 - src); // copy text up to oldname strncpy(p, src, p2 - src); // copy text up to oldname
p += p2 - src; // move target pointer (p) p += p2 - src; // move target pointer (p)
-p += sprintf(p, "%s", name); // print new name in place of oldname
-src = p2 + l2; // move src to end of oldname
+p += sprintf(p, "%s",
+name); // print new name in place of oldname
+src = p2 + l2; // move src to end of oldname
} }
sprintf(p, "%s", src); // copy the rest of src sprintf(p, "%s", src); // copy the rest of src
return ret; return ret;
} }
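The loop in dir_new_path() above substitutes the new container name for every occurrence of the old one in a path. As a minimal standalone sketch of that technique (the helper name `replace_all` and its two-pass sizing are assumptions made for illustration, not code from LXC):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of LXC: returns a newly allocated copy of
 * src in which every occurrence of oldname is replaced by name. The caller
 * frees the result. Returns NULL if the allocation fails. */
char *replace_all(const char *src, const char *oldname, const char *name)
{
	size_t l_old = strlen(oldname), l_new = strlen(name);
	size_t count = 0, len;
	const char *p, *hit;
	char *ret, *out;

	assert(l_old > 0); /* an empty pattern would never advance */

	/* First pass: count matches so the output can be sized exactly. */
	for (p = src; (hit = strstr(p, oldname)) != NULL; p = hit + l_old)
		count++;

	len = strlen(src) + count * l_new - count * l_old + 1;
	ret = malloc(len);
	if (!ret)
		return NULL;

	/* Second pass: copy the text before each match, then the new name. */
	out = ret;
	for (p = src; (hit = strstr(p, oldname)) != NULL; p = hit + l_old) {
		memcpy(out, p, (size_t)(hit - p));
		out += hit - p;
		memcpy(out, name, l_new);
		out += l_new;
	}
	strcpy(out, p); /* copy the remainder, including the terminator */
	return ret;
}
```

With this sketch, `replace_all("/var/lib/lxc/c1/rootfs", "c1", "c2")` would yield `/var/lib/lxc/c2/rootfs`.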
...@@ -264,15 +266,19 @@ bool attach_block_device(struct lxc_conf *conf) ...@@ -264,15 +266,19 @@ bool attach_block_device(struct lxc_conf *conf)
if (!conf->rootfs.path) if (!conf->rootfs.path)
return true; return true;
path = conf->rootfs.path; path = conf->rootfs.path;
if (!requires_nbd(path)) if (!requires_nbd(path))
return true; return true;
path = strchr(path, ':'); path = strchr(path, ':');
if (!path) if (!path)
return false; return false;
path++; path++;
if (!attach_nbd(path, conf)) if (!attach_nbd(path, conf))
return false; return false;
return true; return true;
} }
...@@ -283,6 +289,7 @@ bool bdev_can_backup(struct lxc_conf *conf) ...@@ -283,6 +289,7 @@ bool bdev_can_backup(struct lxc_conf *conf)
if (!bdev) if (!bdev)
return false; return false;
ret = bdev->ops->can_backup; ret = bdev->ops->can_backup;
bdev_put(bdev); bdev_put(bdev);
return ret; return ret;
...@@ -293,8 +300,8 @@ bool bdev_can_backup(struct lxc_conf *conf) ...@@ -293,8 +300,8 @@ bool bdev_can_backup(struct lxc_conf *conf)
* the original, mount the new, and rsync the contents. * the original, mount the new, and rsync the contents.
*/ */
struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
const char *lxcpath, const char *bdevtype, int flags, const char *lxcpath, const char *bdevtype, int flags,
const char *bdevdata, uint64_t newsize, int *needs_rdep) const char *bdevdata, uint64_t newsize, int *needs_rdep)
{ {
struct bdev *orig, *new; struct bdev *orig, *new;
pid_t pid; pid_t pid;
...@@ -311,8 +318,9 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -311,8 +318,9 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
* we don't know how to come up with a new name * we don't know how to come up with a new name
*/ */
if (strstr(src, oldname) == NULL) { if (strstr(src, oldname) == NULL) {
-ERROR("original rootfs path %s doesn't include container name %s",
-src, oldname);
+ERROR(
+"original rootfs path %s doesn't include container name %s",
+src, oldname);
return NULL; return NULL;
} }
...@@ -334,6 +342,7 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -334,6 +342,7 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
bdev_put(orig); bdev_put(orig);
return NULL; return NULL;
} }
ret = snprintf(orig->dest, len, "%s/%s/rootfs", oldpath, oldname); ret = snprintf(orig->dest, len, "%s/%s/rootfs", oldpath, oldname);
if (ret < 0 || (size_t)ret >= len) { if (ret < 0 || (size_t)ret >= len) {
ERROR("rootfs path too long"); ERROR("rootfs path too long");
...@@ -341,9 +350,11 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -341,9 +350,11 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
return NULL; return NULL;
} }
ret = stat(orig->dest, &sb); ret = stat(orig->dest, &sb);
if (ret < 0 && errno == ENOENT) if (ret < 0 && errno == ENOENT)
if (mkdir_p(orig->dest, 0755) < 0) if (mkdir_p(orig->dest, 0755) < 0)
WARN("Error creating '%s', continuing.", orig->dest); WARN("Error creating '%s', continuing.",
orig->dest);
} }
/* /*
...@@ -357,7 +368,8 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -357,7 +368,8 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
/* /*
* If newtype is NULL and snapshot is set, then use overlayfs * If newtype is NULL and snapshot is set, then use overlayfs
*/ */
if (!bdevtype && !keepbdevtype && snap && strcmp(orig->type , "dir") == 0) if (!bdevtype && !keepbdevtype && snap &&
strcmp(orig->type, "dir") == 0)
bdevtype = "overlayfs"; bdevtype = "overlayfs";
if (am_unpriv() && !unpriv_snap_allowed(orig, bdevtype, snap, maybe_snap)) { if (am_unpriv() && !unpriv_snap_allowed(orig, bdevtype, snap, maybe_snap)) {
...@@ -368,23 +380,24 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -368,23 +380,24 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
*needs_rdep = 0; *needs_rdep = 0;
if (bdevtype && strcmp(orig->type, "dir") == 0 && if (bdevtype && strcmp(orig->type, "dir") == 0 &&
(strcmp(bdevtype, "aufs") == 0 || (strcmp(bdevtype, "aufs") == 0 ||
strcmp(bdevtype, "overlayfs") == 0)) { strcmp(bdevtype, "overlayfs") == 0)) {
*needs_rdep = 1; *needs_rdep = 1;
} else if (snap && strcmp(orig->type, "lvm") == 0 && } else if (snap && strcmp(orig->type, "lvm") == 0 &&
!lvm_is_thin_volume(orig->src)) { !lvm_is_thin_volume(orig->src)) {
*needs_rdep = 1; *needs_rdep = 1;
} }
new = bdev_get(bdevtype ? bdevtype : orig->type); new = bdev_get(bdevtype ? bdevtype : orig->type);
if (!new) { if (!new) {
ERROR("no such block device type: %s", bdevtype ? bdevtype : orig->type); ERROR("no such block device type: %s",
bdevtype ? bdevtype : orig->type);
bdev_put(orig); bdev_put(orig);
return NULL; return NULL;
} }
if (new->ops->clone_paths(orig, new, oldname, cname, oldpath, lxcpath, if (new->ops->clone_paths(orig, new, oldname, cname, oldpath, lxcpath,
snap, newsize, c0->lxc_conf) < 0) { snap, newsize, c0->lxc_conf) < 0) {
ERROR("failed getting pathnames for cloned storage: %s", src); ERROR("failed getting pathnames for cloned storage: %s", src);
goto err; goto err;
} }
...@@ -397,11 +410,12 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -397,11 +410,12 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
/* /*
* https://github.com/lxc/lxc/issues/131 * https://github.com/lxc/lxc/issues/131
-* Use btrfs snapshot feature instead of rsync to restore if both orig and new are btrfs
+* Use btrfs snapshot feature instead of rsync to restore if both orig
+* and new are btrfs
*/ */
-if (bdevtype &&
-strcmp(orig->type, "btrfs") == 0 && strcmp(new->type, "btrfs") == 0 &&
+if (bdevtype && strcmp(orig->type, "btrfs") == 0 &&
+strcmp(new->type, "btrfs") == 0 &&
btrfs_same_fs(orig->dest, new->dest) == 0) { btrfs_same_fs(orig->dest, new->dest) == 0) {
if (btrfs_destroy(new) < 0) { if (btrfs_destroy(new) < 0) {
ERROR("Error destroying %s subvolume", new->dest); ERROR("Error destroying %s subvolume", new->dest);
goto err; goto err;
...@@ -411,7 +425,8 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -411,7 +425,8 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
goto err; goto err;
} }
if (btrfs_snapshot(orig->dest, new->dest) < 0) { if (btrfs_snapshot(orig->dest, new->dest) < 0) {
ERROR("Error restoring %s to %s", orig->dest, new->dest); ERROR("Error restoring %s to %s", orig->dest,
new->dest);
goto err; goto err;
} }
bdev_put(orig); bdev_put(orig);
...@@ -437,7 +452,8 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname, ...@@ -437,7 +452,8 @@ struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
data.orig = orig; data.orig = orig;
data.new = new; data.new = new;
if (am_unpriv()) if (am_unpriv())
ret = userns_exec_1(c0->lxc_conf, rsync_rootfs_wrapper, &data, "rsync_rootfs_wrapper"); ret = userns_exec_1(c0->lxc_conf, rsync_rootfs_wrapper, &data,
"rsync_rootfs_wrapper");
else else
ret = rsync_rootfs(&data); ret = rsync_rootfs(&data);
...@@ -461,7 +477,7 @@ err: ...@@ -461,7 +477,7 @@ err:
* @specs: details about the backing store to create, like fstype * @specs: details about the backing store to create, like fstype
*/ */
struct bdev *bdev_create(const char *dest, const char *type, const char *cname, struct bdev *bdev_create(const char *dest, const char *type, const char *cname,
struct bdev_specs *specs) struct bdev_specs *specs)
{ {
struct bdev *bdev; struct bdev *bdev;
char *best_options[] = {"btrfs", "zfs", "lvm", "dir", "rbd", NULL}; char *best_options[] = {"btrfs", "zfs", "lvm", "dir", "rbd", NULL};
...@@ -474,10 +490,13 @@ struct bdev *bdev_create(const char *dest, const char *type, const char *cname, ...@@ -474,10 +490,13 @@ struct bdev *bdev_create(const char *dest, const char *type, const char *cname,
// try for the best backing store type, according to our // try for the best backing store type, according to our
// opinionated preferences // opinionated preferences
for (i = 0; best_options[i]; i++) { for (i = 0; best_options[i]; i++) {
-if ((bdev = do_bdev_create(dest, best_options[i], cname, specs)))
+if ((bdev = do_bdev_create(dest, best_options[i], cname,
+specs)))
return bdev; return bdev;
} }
-return NULL; // 'dir' should never fail, so this shouldn't happen
+return NULL; // 'dir' should never fail, so this shouldn't
+// happen
} }
// -B lvm,dir // -B lvm,dir
...@@ -485,7 +504,7 @@ struct bdev *bdev_create(const char *dest, const char *type, const char *cname, ...@@ -485,7 +504,7 @@ struct bdev *bdev_create(const char *dest, const char *type, const char *cname,
char *dup = alloca(strlen(type) + 1), *saveptr = NULL, *token; char *dup = alloca(strlen(type) + 1), *saveptr = NULL, *token;
strcpy(dup, type); strcpy(dup, type);
for (token = strtok_r(dup, ",", &saveptr); token; for (token = strtok_r(dup, ",", &saveptr); token;
token = strtok_r(NULL, ",", &saveptr)) { token = strtok_r(NULL, ",", &saveptr)) {
if ((bdev = do_bdev_create(dest, token, cname, specs))) if ((bdev = do_bdev_create(dest, token, cname, specs)))
return bdev; return bdev;
} }
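The `-B lvm,dir` handling above tries each comma-separated type in order and returns the first backing store that can be created. A minimal sketch of that fallback walk follows. The names `first_working_type`, `type_works_fn` and `only_dir_works` are hypothetical, the predicate merely stands in for a successful do_bdev_create() call, and the sketch copies the list with strdup() instead of alloca() so that a very long argument cannot exhaust the stack.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for "do_bdev_create() succeeded for this type". */
typedef bool (*type_works_fn)(const char *type);

/* Demo predicate (an assumption for this sketch): only "dir" works. */
bool only_dir_works(const char *type)
{
	return strcmp(type, "dir") == 0;
}

/* Walk a comma-separated list such as "lvm,dir" and copy the first token
 * accepted by works() into out. Returns 0 on success, -1 if no token is
 * accepted, the token does not fit in out, or memory runs out. */
int first_working_type(const char *list, type_works_fn works, char *out,
		       size_t outlen)
{
	char *dup, *saveptr = NULL, *token;
	int ret = -1;

	assert(list && works && out && outlen > 0);

	/* strtok_r() modifies its input, so tokenize a private copy. */
	dup = strdup(list);
	if (!dup)
		return -1;

	for (token = strtok_r(dup, ",", &saveptr); token;
	     token = strtok_r(NULL, ",", &saveptr)) {
		if (!works(token))
			continue;
		if (strlen(token) < outlen) {
			strcpy(out, token);
			ret = 0;
		}
		break;
	}

	free(dup);
	return ret;
}
```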
...@@ -518,20 +537,23 @@ int bdev_destroy_wrapper(void *data) ...@@ -518,20 +537,23 @@ int bdev_destroy_wrapper(void *data)
ERROR("Failed to setgid to 0"); ERROR("Failed to setgid to 0");
return -1; return -1;
} }
if (setgroups(0, NULL) < 0) if (setgroups(0, NULL) < 0)
WARN("Failed to clear groups"); WARN("Failed to clear groups");
if (setuid(0) < 0) { if (setuid(0) < 0) {
ERROR("Failed to setuid to 0"); ERROR("Failed to setuid to 0");
return -1; return -1;
} }
if (!bdev_destroy(conf)) if (!bdev_destroy(conf))
return -1; return -1;
else
return 0; return 0;
} }
struct bdev *bdev_init(struct lxc_conf *conf, const char *src, const char *dst, struct bdev *bdev_init(struct lxc_conf *conf, const char *src, const char *dst,
const char *mntopts) const char *mntopts)
{ {
struct bdev *bdev; struct bdev *bdev;
const struct bdev_type *q; const struct bdev_type *q;
...@@ -549,6 +571,7 @@ struct bdev *bdev_init(struct lxc_conf *conf, const char *src, const char *dst, ...@@ -549,6 +571,7 @@ struct bdev *bdev_init(struct lxc_conf *conf, const char *src, const char *dst,
bdev = malloc(sizeof(struct bdev)); bdev = malloc(sizeof(struct bdev));
if (!bdev) if (!bdev)
return NULL; return NULL;
memset(bdev, 0, sizeof(struct bdev)); memset(bdev, 0, sizeof(struct bdev));
bdev->ops = q->ops; bdev->ops = q->ops;
bdev->type = q->name; bdev->type = q->name;
...@@ -620,7 +643,7 @@ void detach_block_device(struct lxc_conf *conf) ...@@ -620,7 +643,7 @@ void detach_block_device(struct lxc_conf *conf)
*/ */
int detect_fs(struct bdev *bdev, char *type, int len) int detect_fs(struct bdev *bdev, char *type, int len)
{ {
int p[2], ret; int p[2], ret;
size_t linelen; size_t linelen;
pid_t pid; pid_t pid;
FILE *f; FILE *f;
...@@ -637,8 +660,10 @@ int detect_fs(struct bdev *bdev, char *type, int len) ...@@ -637,8 +660,10 @@ int detect_fs(struct bdev *bdev, char *type, int len)
ret = pipe(p); ret = pipe(p);
if (ret < 0) if (ret < 0)
return -1; return -1;
if ((pid = fork()) < 0) if ((pid = fork()) < 0)
return -1; return -1;
if (pid > 0) { if (pid > 0) {
int status; int status;
close(p[1]); close(p[1]);
...@@ -664,7 +689,7 @@ int detect_fs(struct bdev *bdev, char *type, int len) ...@@ -664,7 +689,7 @@ int detect_fs(struct bdev *bdev, char *type, int len)
exit(1); exit(1);
if (detect_shared_rootfs()) { if (detect_shared_rootfs()) {
if (mount(NULL, "/", NULL, MS_SLAVE|MS_REC, NULL)) { if (mount(NULL, "/", NULL, MS_SLAVE | MS_REC, NULL)) {
SYSERROR("Failed to make / rslave"); SYSERROR("Failed to make / rslave");
ERROR("Continuing..."); ERROR("Continuing...");
} }
...@@ -672,9 +697,11 @@ int detect_fs(struct bdev *bdev, char *type, int len) ...@@ -672,9 +697,11 @@ int detect_fs(struct bdev *bdev, char *type, int len)
ret = mount_unknown_fs(srcdev, bdev->dest, bdev->mntopts); ret = mount_unknown_fs(srcdev, bdev->dest, bdev->mntopts);
if (ret < 0) { if (ret < 0) {
ERROR("failed mounting %s onto %s to detect fstype", srcdev, bdev->dest); ERROR("failed mounting %s onto %s to detect fstype", srcdev,
bdev->dest);
exit(1); exit(1);
} }
// if symlink, get the real dev name // if symlink, get the real dev name
char devpath[MAXPATHLEN]; char devpath[MAXPATHLEN];
char *l = linkderef(srcdev, devpath); char *l = linkderef(srcdev, devpath);
...@@ -683,6 +710,7 @@ int detect_fs(struct bdev *bdev, char *type, int len) ...@@ -683,6 +710,7 @@ int detect_fs(struct bdev *bdev, char *type, int len)
f = fopen("/proc/self/mounts", "r"); f = fopen("/proc/self/mounts", "r");
if (!f) if (!f)
exit(1); exit(1);
while (getline(&line, &linelen, f) != -1) { while (getline(&line, &linelen, f) != -1) {
sp1 = strchr(line, ' '); sp1 = strchr(line, ' ');
if (!sp1) if (!sp1)
...@@ -701,28 +729,38 @@ int detect_fs(struct bdev *bdev, char *type, int len) ...@@ -701,28 +729,38 @@ int detect_fs(struct bdev *bdev, char *type, int len)
sp2++; sp2++;
if (write(p[1], sp2, strlen(sp2)) != strlen(sp2)) if (write(p[1], sp2, strlen(sp2)) != strlen(sp2))
exit(1); exit(1);
exit(0); exit(0);
} }
exit(1); exit(1);
} }
-int do_mkfs(const char *path, const char *fstype)
+int do_mkfs_exec_wrapper(void *args)
{ {
-pid_t pid;
-if ((pid = fork()) < 0) {
-ERROR("error forking");
-return -1;
-}
-if (pid > 0)
-return wait_for_pid(pid);
-// If the file is not a block device, we don't want mkfs to ask
-// us about whether to proceed.
-if (null_stdfds() < 0)
-exit(1);
-execlp("mkfs", "mkfs", "-t", fstype, path, (char *)NULL);
-exit(1);
+int ret;
+char *mkfs;
+char **data = args;
+/* strlen("mkfs.")
+* +
+* strlen(data[0])
+* +
+* \0
+*/
+size_t len = 5 + strlen(data[0]) + 1;
+mkfs = malloc(len);
+if (!mkfs)
+return -1;
+ret = snprintf(mkfs, len, "mkfs.%s", data[0]);
+if (ret < 0 || (size_t)ret >= len)
+return -1;
+TRACE("executing \"%s %s\"", mkfs, data[1]);
+execlp(mkfs, mkfs, data[1], (char *)NULL);
+SYSERROR("failed to run \"%s %s \"", mkfs, data[1]);
+return -1;
} }
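The new do_mkfs_exec_wrapper() sizes a buffer for "mkfs." plus the filesystem type plus the terminator, then checks the snprintf() result for truncation. The sketch below isolates that step. The helper name `mkfs_command_name` is hypothetical, and unlike the hunk above it frees the buffer on the snprintf() failure path, which is a small variation of this sketch rather than something the commit does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of LXC: builds "mkfs.<fstype>" in a freshly
 * allocated buffer. Returns NULL on allocation or formatting failure; the
 * caller frees the result. */
char *mkfs_command_name(const char *fstype)
{
	size_t len;
	char *cmd;
	int ret;

	assert(fstype != NULL);

	/* strlen("mkfs.") + strlen(fstype) + terminating \0 */
	len = strlen("mkfs.") + strlen(fstype) + 1;

	cmd = malloc(len);
	if (!cmd)
		return NULL;

	ret = snprintf(cmd, len, "mkfs.%s", fstype);
	if (ret < 0 || (size_t)ret >= len) {
		free(cmd); /* avoid leaking the buffer on failure */
		return NULL;
	}

	return cmd;
}
```

A caller could then pass the result to execlp() as both the file and argv[0], in the same way the wrapper above does.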
/* /*
...@@ -733,20 +771,23 @@ int is_blktype(struct bdev *b) ...@@ -733,20 +771,23 @@ int is_blktype(struct bdev *b)
{ {
if (strcmp(b->type, "lvm") == 0) if (strcmp(b->type, "lvm") == 0)
return 1; return 1;
return 0; return 0;
} }
int mount_unknown_fs(const char *rootfs, const char *target, int mount_unknown_fs(const char *rootfs, const char *target,
const char *options) const char *options)
{ {
size_t i;
int ret;
struct cbarg { struct cbarg {
const char *rootfs; const char *rootfs;
const char *target; const char *target;
const char *options; const char *options;
} cbarg = { } cbarg = {
.rootfs = rootfs, .rootfs = rootfs,
.target = target, .target = target,
.options = options, .options = options,
}; };
/* /*
...@@ -755,15 +796,11 @@ int mount_unknown_fs(const char *rootfs, const char *target, ...@@ -755,15 +796,11 @@ int mount_unknown_fs(const char *rootfs, const char *target,
* are auto-loaded and fall back to the supported kernel fs * are auto-loaded and fall back to the supported kernel fs
*/ */
char *fsfile[] = { char *fsfile[] = {
"/etc/filesystems", "/etc/filesystems",
"/proc/filesystems", "/proc/filesystems",
}; };
size_t i;
for (i = 0; i < sizeof(fsfile) / sizeof(fsfile[0]); i++) { for (i = 0; i < sizeof(fsfile) / sizeof(fsfile[0]); i++) {
int ret;
if (access(fsfile[i], F_OK)) if (access(fsfile[i], F_OK))
continue; continue;
...@@ -788,34 +825,38 @@ bool rootfs_is_blockdev(struct lxc_conf *conf) ...@@ -788,34 +825,38 @@ bool rootfs_is_blockdev(struct lxc_conf *conf)
int ret; int ret;
if (!conf->rootfs.path || strcmp(conf->rootfs.path, "/") == 0 || if (!conf->rootfs.path || strcmp(conf->rootfs.path, "/") == 0 ||
strlen(conf->rootfs.path) == 0) strlen(conf->rootfs.path) == 0)
return false; return false;
ret = stat(conf->rootfs.path, &st); ret = stat(conf->rootfs.path, &st);
if (ret == 0 && S_ISBLK(st.st_mode)) if (ret == 0 && S_ISBLK(st.st_mode))
return true; return true;
q = bdev_query(conf, conf->rootfs.path); q = bdev_query(conf, conf->rootfs.path);
if (!q) if (!q)
return false; return false;
if (strcmp(q->name, "lvm") == 0 || if (strcmp(q->name, "lvm") == 0 ||
strcmp(q->name, "loop") == 0 || strcmp(q->name, "loop") == 0 ||
strcmp(q->name, "nbd") == 0) strcmp(q->name, "nbd") == 0)
return true; return true;
return false; return false;
} }
static struct bdev *do_bdev_create(const char *dest, const char *type, static struct bdev *do_bdev_create(const char *dest, const char *type,
const char *cname, struct bdev_specs *specs) const char *cname, struct bdev_specs *specs)
{ {
-struct bdev *bdev = bdev_get(type);
-if (!bdev) {
+struct bdev *bdev;
+bdev = bdev_get(type);
+if (!bdev)
return NULL; return NULL;
-}
if (bdev->ops->create(bdev, dest, cname, specs) < 0) { if (bdev->ops->create(bdev, dest, cname, specs) < 0) {
bdev_put(bdev); bdev_put(bdev);
return NULL; return NULL;
} }
return bdev; return bdev;
...@@ -830,14 +871,18 @@ static struct bdev *bdev_get(const char *type) ...@@ -830,14 +871,18 @@ static struct bdev *bdev_get(const char *type)
if (strcmp(bdevs[i].name, type) == 0) if (strcmp(bdevs[i].name, type) == 0)
break; break;
} }
if (i == numbdevs) if (i == numbdevs)
return NULL; return NULL;
bdev = malloc(sizeof(struct bdev)); bdev = malloc(sizeof(struct bdev));
if (!bdev) if (!bdev)
return NULL; return NULL;
memset(bdev, 0, sizeof(struct bdev)); memset(bdev, 0, sizeof(struct bdev));
bdev->ops = bdevs[i].ops; bdev->ops = bdevs[i].ops;
bdev->type = bdevs[i].name; bdev->type = bdevs[i].name;
return bdev; return bdev;
} }
...@@ -854,12 +899,16 @@ static const struct bdev_type *get_bdev_by_name(const char *name) ...@@ -854,12 +899,16 @@ static const struct bdev_type *get_bdev_by_name(const char *name)
return NULL; return NULL;
} }
-static const struct bdev_type *bdev_query(struct lxc_conf *conf, const char *src)
+static const struct bdev_type *bdev_query(struct lxc_conf *conf,
+const char *src)
{ {
size_t i; size_t i;
-if (conf->rootfs.bdev_type)
+if (conf->rootfs.bdev_type) {
+DEBUG("config file specified rootfs type \"%s\"",
+conf->rootfs.bdev_type);
return get_bdev_by_name(conf->rootfs.bdev_type); return get_bdev_by_name(conf->rootfs.bdev_type);
+}
for (i = 0; i < numbdevs; i++) { for (i = 0; i < numbdevs; i++) {
int r; int r;
...@@ -870,6 +919,9 @@ static const struct bdev_type *bdev_query(struct lxc_conf *conf, const char *src ...@@ -870,6 +919,9 @@ static const struct bdev_type *bdev_query(struct lxc_conf *conf, const char *src
if (i == numbdevs) if (i == numbdevs)
return NULL; return NULL;
DEBUG("detected rootfs type \"%s\"", bdevs[i].name);
return &bdevs[i]; return &bdevs[i];
} }
...@@ -878,7 +930,7 @@ static const struct bdev_type *bdev_query(struct lxc_conf *conf, const char *src ...@@ -878,7 +930,7 @@ static const struct bdev_type *bdev_query(struct lxc_conf *conf, const char *src
* the callback system, they can be pulled from there eventually, so we * the callback system, they can be pulled from there eventually, so we
* don't need to pollute utils.c with these low level functions * don't need to pollute utils.c with these low level functions
*/ */
static int find_fstype_cb(char* buffer, void *data) static int find_fstype_cb(char *buffer, void *data)
{ {
struct cbarg { struct cbarg {
const char *rootfs; const char *rootfs;
...@@ -898,8 +950,8 @@ static int find_fstype_cb(char* buffer, void *data) ...@@ -898,8 +950,8 @@ static int find_fstype_cb(char* buffer, void *data)
fstype += lxc_char_left_gc(fstype, strlen(fstype)); fstype += lxc_char_left_gc(fstype, strlen(fstype));
fstype[lxc_char_right_gc(fstype, strlen(fstype))] = '\0'; fstype[lxc_char_right_gc(fstype, strlen(fstype))] = '\0';
DEBUG("trying to mount '%s'->'%s' with fstype '%s'", DEBUG("trying to mount '%s'->'%s' with fstype '%s'", cbarg->rootfs,
cbarg->rootfs, cbarg->target, fstype); cbarg->target, fstype);
if (parse_mntopts(cbarg->options, &mntflags, &mntdata) < 0) { if (parse_mntopts(cbarg->options, &mntflags, &mntdata) < 0) {
free(mntdata); free(mntdata);
...@@ -914,8 +966,8 @@ static int find_fstype_cb(char* buffer, void *data) ...@@ -914,8 +966,8 @@ static int find_fstype_cb(char* buffer, void *data)
free(mntdata); free(mntdata);
INFO("mounted '%s' on '%s', with fstype '%s'", INFO("mounted '%s' on '%s', with fstype '%s'", cbarg->rootfs,
cbarg->rootfs, cbarg->target, fstype); cbarg->target, fstype);
return 1; return 1;
} }
...@@ -928,8 +980,10 @@ static char *linkderef(char *path, char *dest) ...@@ -928,8 +980,10 @@ static char *linkderef(char *path, char *dest)
ret = stat(path, &sbuf); ret = stat(path, &sbuf);
if (ret < 0) if (ret < 0)
return NULL; return NULL;
if (!S_ISLNK(sbuf.st_mode)) if (!S_ISLNK(sbuf.st_mode))
return path; return path;
ret = readlink(path, dest, MAXPATHLEN); ret = readlink(path, dest, MAXPATHLEN);
if (ret < 0) { if (ret < 0) {
SYSERROR("error reading link %s", path); SYSERROR("error reading link %s", path);
...@@ -939,6 +993,7 @@ static char *linkderef(char *path, char *dest) ...@@ -939,6 +993,7 @@ static char *linkderef(char *path, char *dest)
return NULL; return NULL;
} }
dest[ret] = '\0'; dest[ret] = '\0';
return dest; return dest;
} }
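One observation on linkderef() above: stat() follows symbolic links, so S_ISLNK() applied to its result is not expected to be true, whereas lstat() reports on the link itself. The sketch below shows the dereference step built on lstat(). The name `deref_link` and the explicit buffer-length parameter are assumptions for illustration, not the upstream function, and a target longer than the buffer is silently truncated here.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical variant of linkderef(): if path is a symlink, store its
 * target in dest and return dest; if it is not a symlink, return path
 * unchanged; return NULL on error. */
const char *deref_link(const char *path, char *dest, size_t destlen)
{
	struct stat sb;
	ssize_t n;

	assert(path && dest && destlen > 0);

	/* lstat() does not follow the link, so S_ISLNK() can match. */
	if (lstat(path, &sb) < 0)
		return NULL;

	if (!S_ISLNK(sb.st_mode))
		return path;

	/* readlink() does not NUL-terminate; leave room for the terminator. */
	n = readlink(path, dest, destlen - 1);
	if (n < 0)
		return NULL;

	dest[n] = '\0';
	return dest;
}
```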
...@@ -946,43 +1001,46 @@ static char *linkderef(char *path, char *dest) ...@@ -946,43 +1001,46 @@ static char *linkderef(char *path, char *dest)
* is an unprivileged user allowed to make this kind of snapshot * is an unprivileged user allowed to make this kind of snapshot
*/ */
static bool unpriv_snap_allowed(struct bdev *b, const char *t, bool snap, static bool unpriv_snap_allowed(struct bdev *b, const char *t, bool snap,
bool maybesnap) bool maybesnap)
{ {
if (!t) { if (!t) {
// new type will be same as original // new type will be same as original
// (unless snap && b->type == dir, in which case it will be // (unless snap && b->type == dir, in which case it will be
// overlayfs -- which is also allowed) // overlayfs -- which is also allowed)
if (strcmp(b->type, "dir") == 0 || if (strcmp(b->type, "dir") == 0 ||
strcmp(b->type, "aufs") == 0 || strcmp(b->type, "aufs") == 0 ||
strcmp(b->type, "overlayfs") == 0 || strcmp(b->type, "overlayfs") == 0 ||
strcmp(b->type, "btrfs") == 0 || strcmp(b->type, "btrfs") == 0 ||
strcmp(b->type, "loop") == 0) strcmp(b->type, "loop") == 0)
return true; return true;
return false; return false;
} }
// unprivileged users can copy and snapshot dir, overlayfs, // unprivileged users can copy and snapshot dir, overlayfs,
// and loop. In particular, not zfs, btrfs, or lvm. // and loop. In particular, not zfs, btrfs, or lvm.
if (strcmp(t, "dir") == 0 || if (strcmp(t, "dir") == 0 ||
strcmp(t, "aufs") == 0 || strcmp(t, "aufs") == 0 ||
strcmp(t, "overlayfs") == 0 || strcmp(t, "overlayfs") == 0 ||
strcmp(t, "btrfs") == 0 || strcmp(t, "btrfs") == 0 ||
strcmp(t, "loop") == 0) strcmp(t, "loop") == 0)
return true; return true;
return false; return false;
} }
bool is_valid_bdev_type(const char *type) bool is_valid_bdev_type(const char *type)
{ {
if (strcmp(type, "dir") == 0 || if (strcmp(type, "dir") == 0 ||
strcmp(type, "btrfs") == 0 || strcmp(type, "btrfs") == 0 ||
strcmp(type, "aufs") == 0 || strcmp(type, "aufs") == 0 ||
strcmp(type, "loop") == 0 || strcmp(type, "loop") == 0 ||
strcmp(type, "lvm") == 0 || strcmp(type, "lvm") == 0 ||
strcmp(type, "nbd") == 0 || strcmp(type, "nbd") == 0 ||
strcmp(type, "overlayfs") == 0 || strcmp(type, "overlayfs") == 0 ||
strcmp(type, "rbd") == 0 || strcmp(type, "rbd") == 0 ||
strcmp(type, "zfs") == 0) strcmp(type, "zfs") == 0)
return true; return true;
return false; return false;
} }
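is_valid_bdev_type() above compares the type against each supported name in turn. The same check can be written as a table lookup, which keeps the list of names in one place. This is only a sketch under stated assumptions: `known_types` and `is_known_type` are hypothetical names, and the table simply repeats the nine names from the function above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* The backing store names accepted by is_valid_bdev_type() in the hunk
 * above, repeated here for illustration. */
static const char *const known_types[] = {
	"aufs", "btrfs", "dir", "loop", "lvm",
	"nbd", "overlayfs", "rbd", "zfs",
};

/* Hypothetical table-driven equivalent of is_valid_bdev_type(). */
bool is_known_type(const char *type)
{
	size_t i;

	assert(type != NULL);

	for (i = 0; i < sizeof(known_types) / sizeof(known_types[0]); i++) {
		if (strcmp(known_types[i], type) == 0)
			return true;
	}

	return false;
}
```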
...@@ -23,17 +23,13 @@ ...@@ -23,17 +23,13 @@
#ifndef __LXC_BDEV_H #ifndef __LXC_BDEV_H
#define __LXC_BDEV_H #define __LXC_BDEV_H
-/* blockdev operations for:
-* aufs, dir, raw, btrfs, overlayfs, aufs, lvm, loop, zfs, nbd (qcow2, raw, vdi, qed)
-*/
-#include <lxc/lxccontainer.h>
+#include "config.h"
#include <stdint.h> #include <stdint.h>
#include <sys/mount.h> #include <sys/mount.h>
-#include "config.h"
+#include <lxc/lxccontainer.h>
-/* define constants if the kernel/glibc headers don't define them */
#ifndef MS_DIRSYNC #ifndef MS_DIRSYNC
#define MS_DIRSYNC 128 #define MS_DIRSYNC 128
#endif #endif
...@@ -71,20 +67,21 @@ struct bdev_ops { ...@@ -71,20 +67,21 @@ struct bdev_ops {
int (*umount)(struct bdev *bdev); int (*umount)(struct bdev *bdev);
int (*destroy)(struct bdev *bdev); int (*destroy)(struct bdev *bdev);
int (*create)(struct bdev *bdev, const char *dest, const char *n, int (*create)(struct bdev *bdev, const char *dest, const char *n,
struct bdev_specs *specs); struct bdev_specs *specs);
/* given original mount, rename the paths for cloned container */ /* given original mount, rename the paths for cloned container */
-int (*clone_paths)(struct bdev *orig, struct bdev *new, const char *oldname,
-const char *cname, const char *oldpath, const char *lxcpath,
-int snap, uint64_t newsize, struct lxc_conf *conf);
+int (*clone_paths)(struct bdev *orig, struct bdev *new,
+const char *oldname, const char *cname,
+const char *oldpath, const char *lxcpath, int snap,
+uint64_t newsize, struct lxc_conf *conf);
bool can_snapshot; bool can_snapshot;
bool can_backup; bool can_backup;
}; };
/* /*
-* When lxc-start (conf.c) is mounting a rootfs, then src will be the
-* 'lxc.rootfs' value, dest will be mount dir (i.e. $libdir/lxc) When clone
-* or create is doing so, then dest will be $lxcpath/$lxcname/rootfs, since
-* we may need to rsync from one to the other.
+* When lxc-start is mounting a rootfs, then src will be the "lxc.rootfs" value,
+* dest will be mount dir (i.e. $libdir/lxc) When clone or create is doing so,
+* then dest will be $lxcpath/$lxcname/rootfs, since we may need to rsync from
+* one to the other.
* data is so far unused. * data is so far unused.
*/ */
struct bdev { struct bdev {
...@@ -93,10 +90,10 @@ struct bdev { ...@@ -93,10 +90,10 @@ struct bdev {
char *src; char *src;
char *dest; char *dest;
char *mntopts; char *mntopts;
-// turn the following into a union if need be
-// lofd is the open fd for the mounted loopback file
+/* Turn the following into a union if need be. */
+/* lofd is the open fd for the mounted loopback file. */
int lofd; int lofd;
-// index for the connected nbd device
+/* index for the connected nbd device. */
int nbd_idx; int nbd_idx;
}; };
@@ -104,27 +101,27 @@ bool bdev_is_dir(struct lxc_conf *conf, const char *path);
 bool bdev_can_backup(struct lxc_conf *conf);
 /*
- * Instantiate a bdev object. The src is used to determine which blockdev
- * type this should be. The dst and data are optional, and will be used
- * in case of mount/umount.
+ * Instantiate a bdev object. The src is used to determine which blockdev type
+ * this should be. The dst and data are optional, and will be used in case of
+ * mount/umount.
  *
  * Optionally, src can be 'dir:/var/lib/lxc/c1' or 'lvm:/dev/lxc/c1'. For
  * other backing stores, this will allow additional options. In particular,
  * "overlayfs:/var/lib/lxc/canonical/rootfs:/var/lib/lxc/c1/delta" will mean
  * use /var/lib/lxc/canonical/rootfs as lower dir, and /var/lib/lxc/c1/delta
  * as the upper, writeable layer.
  */
 struct bdev *bdev_init(struct lxc_conf *conf, const char *src, const char *dst,
 		const char *data);
 struct bdev *bdev_copy(struct lxc_container *c0, const char *cname,
-		const char *lxcpath, const char *bdevtype,
-		int flags, const char *bdevdata, uint64_t newsize,
-		int *needs_rdep);
-struct bdev *bdev_create(const char *dest, const char *type,
-		const char *cname, struct bdev_specs *specs);
+		const char *lxcpath, const char *bdevtype, int flags,
+		const char *bdevdata, uint64_t newsize, int *needs_rdep);
+struct bdev *bdev_create(const char *dest, const char *type, const char *cname,
+		struct bdev_specs *specs);
 void bdev_put(struct bdev *bdev);
 bool bdev_destroy(struct lxc_conf *conf);
 /* callback function to be used with userns_exec_1() */
 int bdev_destroy_wrapper(void *data);
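The `bdev_init()` comment above describes `src` values that carry an optional `type:` prefix, such as `lvm:/dev/lxc/c1`. As a minimal sketch of how such a string could be split, assuming a hypothetical helper named `split_bdev_src` (it is not part of LXC, and the real detection logic lives in the per-backend `detect` callbacks):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of LXC: split "type:rest" at the first colon.
 * Returns the length of the type prefix, or 0 if src has no prefix (for
 * example a plain absolute path). *rest points at the text after the colon. */
static size_t split_bdev_src(const char *src, const char **rest)
{
	const char *colon;

	assert(src != NULL && rest != NULL);

	colon = strchr(src, ':');
	/* Absolute paths and strings without ':' carry no type prefix. */
	if (!colon || src[0] == '/') {
		*rest = src;
		return 0;
	}

	*rest = colon + 1;
	return (size_t)(colon - src);
}
```

For the overlayfs form, the remainder after the prefix still contains a colon separating the lower and upper directories, so a second split would be needed there.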
@@ -134,11 +131,12 @@ int bdev_destroy_wrapper(void *data);
  */
 int blk_getsize(struct bdev *bdev, uint64_t *size);
 int detect_fs(struct bdev *bdev, char *type, int len);
-int do_mkfs(const char *path, const char *fstype);
+int do_mkfs_exec_wrapper(void *args);
 int is_blktype(struct bdev *b);
 int mount_unknown_fs(const char *rootfs, const char *target,
 		const char *options);
 bool rootfs_is_blockdev(struct lxc_conf *conf);
 /*
  * these are really for qemu-nbd support, as container shutdown
  * must explicitly request device detach.
......
@@ -28,6 +28,7 @@
 #include <string.h>
 #include <unistd.h>
 #include <linux/loop.h>
+#include <sys/stat.h>
 #include <sys/types.h>
 #include "bdev.h"
@@ -157,8 +158,19 @@ int loop_destroy(struct bdev *orig)
 int loop_detect(const char *path)
 {
+	int ret;
+	struct stat s;
 	if (strncmp(path, "loop:", 5) == 0)
 		return 1;
+	ret = stat(path, &s);
+	if (ret < 0)
+		return 0;
+	if (__S_ISTYPE(s.st_mode, S_IFREG))
+		return 1;
 	return 0;
 }
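With this hunk, `loop_detect()` accepts either an explicit `loop:` prefix or any path that is a regular file. A minimal standalone sketch of the same check follows. The function name `is_loop_source` is hypothetical, and it uses the POSIX `S_ISREG` macro in place of the glibc-internal `__S_ISTYPE`:

```c
#include <assert.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical helper mirroring the detection order in the hunk above:
 * an explicit "loop:" prefix wins, otherwise any regular file qualifies. */
static int is_loop_source(const char *path)
{
	struct stat st;

	assert(path != NULL);

	if (strncmp(path, "loop:", 5) == 0)
		return 1;

	/* A path that cannot be stat()ed is not treated as a loop file. */
	if (stat(path, &st) < 0)
		return 0;

	/* S_ISREG(mode) is the POSIX spelling of __S_ISTYPE(mode, S_IFREG). */
	return S_ISREG(st.st_mode) ? 1 : 0;
}
```

Note that the prefix branch never touches the filesystem, so a prefixed path to a missing file is still reported as a loop source.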
@@ -166,15 +178,23 @@ int loop_mount(struct bdev *bdev)
 {
 	int ret, loopfd;
 	char loname[MAXPATHLEN];
+	char *src = bdev->src;
 	if (strcmp(bdev->type, "loop"))
 		return -22;
 	if (!bdev->src || !bdev->dest)
 		return -22;
-	loopfd = lxc_prepare_loop_dev(bdev->src + 5, loname, LO_FLAGS_AUTOCLEAR);
-	if (loopfd < 0)
+	/* skip prefix */
+	if (!strncmp(bdev->src, "loop:", 5))
+		src += 5;
+	loopfd = lxc_prepare_loop_dev(src, loname, LO_FLAGS_AUTOCLEAR);
+	if (loopfd < 0) {
+		ERROR("failed to prepare loop device for loop file \"%s\"", src);
 		return -1;
+	}
 	DEBUG("prepared loop device \"%s\"", loname);
 	ret = mount_unknown_fs(loname, bdev->dest, bdev->mntopts);
@@ -206,6 +226,9 @@ int loop_umount(struct bdev *bdev)
 static int do_loop_create(const char *path, uint64_t size, const char *fstype)
 {
 	int fd, ret;
+	const char *cmd_args[2] = {fstype, path};
+	char cmd_output[MAXPATHLEN];
 	// create the new loopback file.
 	fd = creat(path, S_IRUSR|S_IWUSR);
 	if (fd < 0)
@@ -227,11 +250,10 @@ static int do_loop_create(const char *path, uint64_t size, const char *fstype)
 	}
 	// create an fs in the loopback file
-	if (do_mkfs(path, fstype) < 0) {
-		ERROR("Error creating filesystem type %s on %s", fstype,
-			path);
+	ret = run_command(cmd_output, sizeof(cmd_output), do_mkfs_exec_wrapper,
+			  (void *)cmd_args);
+	if (ret < 0)
 		return -1;
-	}
 	return 0;
 }
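The hunk above replaces `do_mkfs()` with `run_command()` plus an exec wrapper that receives `{fstype, path}` through a `void *`. The body of `run_command()` is not shown in this diff, so the following is only a sketch of the general fork-and-capture pattern such a helper could follow. The names `run_and_capture` and `echo_exec_wrapper` are hypothetical, and `echo` stands in for `mkfs`:

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the run_command() pattern: fork, run child_fn
 * with stdout/stderr redirected into a pipe, and collect the output. */
static int run_and_capture(char *buf, size_t buf_size,
			   int (*child_fn)(void *), void *args)
{
	int pipefd[2], status;
	ssize_t n;
	size_t used = 0;
	pid_t pid;

	assert(buf != NULL && buf_size > 0);
	buf[0] = '\0';

	if (pipe(pipefd) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(pipefd[0]);
		close(pipefd[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: route stdout and stderr into the pipe. */
		close(pipefd[0]);
		dup2(pipefd[1], STDOUT_FILENO);
		dup2(pipefd[1], STDERR_FILENO);
		close(pipefd[1]);
		/* child_fn normally exec()s and only returns on failure. */
		_exit(child_fn(args) == 0 ? 0 : 1);
	}

	/* Parent: read until EOF or until the buffer is full. */
	close(pipefd[1]);
	while (used < buf_size - 1 &&
	       (n = read(pipefd[0], buf + used, buf_size - 1 - used)) > 0)
		used += (size_t)n;
	buf[used] = '\0';
	close(pipefd[0]);

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/* Example callback taking a two-element array, like cmd_args[2] above. */
static int echo_exec_wrapper(void *args)
{
	const char **pair = args;

	execlp("echo", "echo", pair[0], pair[1], (char *)NULL);
	return -1; /* only reached if exec failed */
}
```

Passing the arguments as an array behind a `void *` keeps the callback signature generic, at the cost of the callee having to know the array layout.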
@@ -282,6 +282,8 @@ int lvm_clonepaths(struct bdev *orig, struct bdev *new, const char *oldname,
 	char fstype[100];
 	uint64_t size = newsize;
 	int len, ret;
+	const char *cmd_args[2];
+	char cmd_output[MAXPATHLEN];
 	if (!orig->src || !orig->dest)
 		return -1;
@@ -348,11 +350,14 @@ int lvm_clonepaths(struct bdev *orig, struct bdev *new, const char *oldname,
 			ERROR("Error creating new lvm blockdev");
 			return -1;
 		}
-		if (do_mkfs(new->src, fstype) < 0) {
-			ERROR("Error creating filesystem type %s on %s", fstype,
-				new->src);
+		cmd_args[0] = fstype;
+		cmd_args[1] = new->src;
+		// create an fs in the loopback file
+		ret = run_command(cmd_output, sizeof(cmd_output),
+				  do_mkfs_exec_wrapper, (void *)cmd_args);
+		if (ret < 0)
 			return -1;
-		}
 	}
 	return 0;
@@ -378,6 +383,8 @@ int lvm_create(struct bdev *bdev, const char *dest, const char *n,
 	const char *vg, *thinpool, *fstype, *lv = n;
 	uint64_t sz;
 	int ret, len;
+	const char *cmd_args[2];
+	char cmd_output[MAXPATHLEN];
 	if (!specs)
 		return -1;
@@ -416,11 +423,14 @@ int lvm_create(struct bdev *bdev, const char *dest, const char *n,
 	fstype = specs->fstype;
 	if (!fstype)
 		fstype = DEFAULT_FSTYPE;
-	if (do_mkfs(bdev->src, fstype) < 0) {
-		ERROR("Error creating filesystem type %s on %s", fstype,
-			bdev->src);
+	cmd_args[0] = fstype;
+	cmd_args[1] = bdev->src;
+	ret = run_command(cmd_output, sizeof(cmd_output), do_mkfs_exec_wrapper,
+			  (void *)cmd_args);
+	if (ret < 0)
 		return -1;
-	}
 	if (!(bdev->dest = strdup(dest)))
 		return -1;
......
@@ -51,6 +51,8 @@ int rbd_create(struct bdev *bdev, const char *dest, const char *n,
 	int ret, len;
 	char sz[24];
 	pid_t pid;
+	const char *cmd_args[2];
+	char cmd_output[MAXPATHLEN];
 	if (!specs)
 		return -1;
@@ -104,11 +106,13 @@ int rbd_create(struct bdev *bdev, const char *dest, const char *n,
 	if (!fstype)
 		fstype = DEFAULT_FSTYPE;
-	if (do_mkfs(bdev->src, fstype) < 0) {
-		ERROR("Error creating filesystem type %s on %s", fstype,
-			bdev->src);
+	cmd_args[0] = fstype;
+	cmd_args[1] = bdev->src;
+	ret = run_command(cmd_output, sizeof(cmd_output), do_mkfs_exec_wrapper,
+			  (void *)cmd_args);
+	if (ret < 0)
 		return -1;
-	}
 	if (!(bdev->dest = strdup(dest)))
 		return -1;
......
@@ -171,7 +171,7 @@ static int lxc_cmd_rsp_recv(int sock, struct lxc_cmd_rr *cmd)
 	int ret,rspfd;
 	struct lxc_cmd_rsp *rsp = &cmd->rsp;
-	ret = lxc_abstract_unix_recv_fd(sock, &rspfd, rsp, sizeof(*rsp));
+	ret = lxc_abstract_unix_recv_fds(sock, &rspfd, 1, rsp, sizeof(*rsp));
 	if (ret < 0) {
 		WARN("Command %s failed to receive response: %s.",
 		     lxc_cmd_str(cmd->req.cmd), strerror(errno));
@@ -756,7 +756,7 @@ static int lxc_cmd_console_callback(int fd, struct lxc_cmd_req *req,
 	memset(&rsp, 0, sizeof(rsp));
 	rsp.data = INT_TO_PTR(ttynum);
-	if (lxc_abstract_unix_send_fd(fd, masterfd, &rsp, sizeof(rsp)) < 0) {
+	if (lxc_abstract_unix_send_fds(fd, &masterfd, 1, &rsp, sizeof(rsp)) < 0) {
 		ERROR("Failed to send tty to client.");
 		lxc_console_free(handler->conf, fd);
 		goto out_close;
......
@@ -172,11 +172,6 @@ static int sethostname(const char * name, size_t len)
 }
 #endif
-/* Define __S_ISTYPE if missing from the C library */
-#ifndef __S_ISTYPE
-#define __S_ISTYPE(mode, mask) (((mode) & S_IFMT) == (mask))
-#endif
 #ifndef MS_PRIVATE
 #define MS_PRIVATE (1<<18)
 #endif
@@ -585,49 +580,6 @@ static int run_script(const char *name, const char *section, const char *script,
 	return run_buffer(buffer);
 }
-static int mount_rootfs_dir(const char *rootfs, const char *target,
-			    const char *options)
-{
-	unsigned long mntflags;
-	char *mntdata;
-	int ret;
-	if (parse_mntopts(options, &mntflags, &mntdata) < 0) {
-		free(mntdata);
-		return -1;
-	}
-	ret = mount(rootfs, target, "none", MS_BIND | MS_REC | mntflags, mntdata);
-	free(mntdata);
-	return ret;
-}
-static int lxc_mount_rootfs_file(const char *rootfs, const char *target,
-				 const char *options)
-{
-	int ret, loopfd;
-	char path[MAXPATHLEN];
-	loopfd = lxc_prepare_loop_dev(rootfs, path, LO_FLAGS_AUTOCLEAR);
-	if (loopfd < 0)
-		return -1;
-	DEBUG("prepared loop device \"%s\"", path);
-	ret = mount_unknown_fs(path, target, options);
-	close(loopfd);
-	DEBUG("mounted rootfs \"%s\" on loop device \"%s\" via loop device \"%s\"", rootfs, target, path);
-	return ret;
-}
-static int mount_rootfs_block(const char *rootfs, const char *target,
-			      const char *options)
-{
-	return mount_unknown_fs(rootfs, target, options);
-}
 /*
  * pin_rootfs
  * if rootfs is a directory, then open ${rootfs}/lxc.hold for writing for
@@ -836,49 +788,6 @@ static int lxc_mount_auto_mounts(struct lxc_conf *conf, int flags, struct lxc_ha
 	return 0;
 }
-static int mount_rootfs(const char *rootfs, const char *target, const char *options)
-{
-	char absrootfs[MAXPATHLEN];
-	struct stat s;
-	int i;
-	typedef int (*rootfs_cb)(const char *, const char *, const char *);
-	struct rootfs_type {
-		int type;
-		rootfs_cb cb;
-	} rtfs_type[] = {
-		{ S_IFDIR, mount_rootfs_dir },
-		{ S_IFBLK, mount_rootfs_block },
-		{ S_IFREG, lxc_mount_rootfs_file },
-	};
-	if (!realpath(rootfs, absrootfs)) {
-		SYSERROR("Failed to get real path for \"%s\".", rootfs);
-		return -1;
-	}
-	if (access(absrootfs, F_OK)) {
-		SYSERROR("The rootfs \"%s\" is not accessible.", absrootfs);
-		return -1;
-	}
-	if (stat(absrootfs, &s)) {
-		SYSERROR("Failed to stat the rootfs \"%s\".", absrootfs);
-		return -1;
-	}
-	for (i = 0; i < sizeof(rtfs_type)/sizeof(rtfs_type[0]); i++) {
-		if (!__S_ISTYPE(s.st_mode, rtfs_type[i].type))
-			continue;
-		return rtfs_type[i].cb(absrootfs, target, options);
-	}
-	ERROR("Unsupported rootfs type for rootfs \"%s\".", absrootfs);
-	return -1;
-}
 static int setup_utsname(struct utsname *utsname)
 {
 	if (!utsname)
@@ -1258,8 +1167,9 @@ static int lxc_fill_autodev(const struct lxc_rootfs *rootfs)
 	return 0;
 }
-static int setup_rootfs(struct lxc_conf *conf)
+static int lxc_setup_rootfs(struct lxc_conf *conf)
 {
+	int ret;
 	struct bdev *bdev;
 	const struct lxc_rootfs *rootfs;
@@ -1278,18 +1188,17 @@ static int setup_rootfs(struct lxc_conf *conf)
 		return -1;
 	}
-	/* First try mounting rootfs using a bdev. */
 	bdev = bdev_init(conf, rootfs->path, rootfs->mount, rootfs->options);
-	if (bdev && !bdev->ops->mount(bdev)) {
-		bdev_put(bdev);
-		DEBUG("Mounted rootfs \"%s\" onto \"%s\" with options \"%s\".",
+	if (!bdev) {
+		ERROR("Failed to mount rootfs \"%s\" onto \"%s\" with options \"%s\".",
 		      rootfs->path, rootfs->mount,
 		      rootfs->options ? rootfs->options : "(null)");
-		return 0;
+		return -1;
 	}
-	if (bdev)
-		bdev_put(bdev);
-	if (mount_rootfs(rootfs->path, rootfs->mount, rootfs->options)) {
+	ret = bdev->ops->mount(bdev);
+	bdev_put(bdev);
+	if (ret < 0) {
 		ERROR("Failed to mount rootfs \"%s\" onto \"%s\" with options \"%s\".",
 		      rootfs->path, rootfs->mount,
 		      rootfs->options ? rootfs->options : "(null)");
@@ -1299,6 +1208,7 @@ static int setup_rootfs(struct lxc_conf *conf)
 	DEBUG("Mounted rootfs \"%s\" onto \"%s\" with options \"%s\".",
 	      rootfs->path, rootfs->mount,
 	      rootfs->options ? rootfs->options : "(null)");
 	return 0;
 }
@@ -3418,7 +3328,14 @@ static int write_id_mapping(enum idtype idtype, pid_t pid, const char *buf,
 	return 0;
 }
-/* Check whether a binary exist and has either CAP_SETUID, CAP_SETGID or both. */
+/* Check whether a binary exist and has either CAP_SETUID, CAP_SETGID or both.
+ *
+ * @return 1 if functional binary was found
+ * @return 0 if binary exists but is lacking privilege
+ * @return -ENOENT if binary does not exist
+ * @return -EINVAL if cap to check is neither CAP_SETUID nor CAP_SETGID
+ *
+ */
 static int idmaptool_on_path_and_privileged(const char *binary, cap_value_t cap)
 {
 	char *path;
@@ -3426,6 +3343,9 @@ static int idmaptool_on_path_and_privileged(const char *binary, cap_value_t cap)
 	struct stat st;
 	int fret = 0;
+	if (cap != CAP_SETUID && cap != CAP_SETGID)
+		return -EINVAL;
 	path = on_path(binary, NULL);
 	if (!path)
 		return -ENOENT;
@@ -3514,7 +3434,17 @@ int lxc_map_ids(struct lxc_list *idmap, pid_t pid)
 	 * range by shadow.
 	 */
 	uidmap = idmaptool_on_path_and_privileged("newuidmap", CAP_SETUID);
+	if (uidmap == -ENOENT)
+		WARN("newuidmap binary is missing");
+	else if (!uidmap)
+		WARN("newuidmap is lacking necessary privileges");
 	gidmap = idmaptool_on_path_and_privileged("newgidmap", CAP_SETGID);
+	if (gidmap == -ENOENT)
+		WARN("newgidmap binary is missing");
+	else if (!gidmap)
+		WARN("newgidmap is lacking necessary privileges");
 	if (uidmap > 0 && gidmap > 0) {
 		DEBUG("Functional newuidmap and newgidmap binary found.");
 		use_shadow = true;
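The documented return values (`1`, `0`, `-ENOENT`, `-EINVAL`) let the caller distinguish a missing tool from an unprivileged one. The sketch below illustrates that tri-state convention under a simplifying assumption: the hypothetical `tool_is_privileged` only inspects the setuid bit of a given path, whereas the function in the diff looks the binary up on `PATH` and, per its comment, checks for CAP_SETUID or CAP_SETGID.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical, simplified probe: only the setuid bit is inspected here,
 * which is an assumption made for the sake of a self-contained example. */
static int tool_is_privileged(const char *path)
{
	struct stat st;

	assert(path != NULL);

	/* Any stat failure is reported as "missing" in this sketch. */
	if (stat(path, &st) < 0)
		return -ENOENT;

	/* 1: usable, 0: present but unprivileged. */
	return (st.st_mode & S_ISUID) ? 1 : 0;
}

/* Map the result to diagnostics in the style of the caller in the diff. */
static const char *describe_probe(int ret)
{
	if (ret == -ENOENT)
		return "binary is missing";
	if (ret == 0)
		return "lacking necessary privileges";
	if (ret > 0)
		return "functional";
	return "invalid argument";
}
```

Returning negative errno values alongside `0` and `1` means callers must test `> 0` for success rather than treating any non-zero value as true, which is what the `uidmap > 0 && gidmap > 0` check does.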
@@ -3916,16 +3846,21 @@ int chown_mapped_root(char *path, struct lxc_conf *conf)
 	return ret;
 }
-int ttys_shift_ids(struct lxc_conf *c)
+int lxc_ttys_shift_ids(struct lxc_conf *c)
 {
 	if (lxc_list_empty(&c->id_map))
 		return 0;
-	if (strcmp(c->console.name, "") !=0 && chown_mapped_root(c->console.name, c) < 0) {
-		ERROR("Failed to chown %s", c->console.name);
+	if (!strcmp(c->console.name, ""))
+		return 0;
+	if (chown_mapped_root(c->console.name, c) < 0) {
+		ERROR("failed to chown console \"%s\"", c->console.name);
 		return -1;
 	}
+	TRACE("chowned console \"%s\"", c->console.name);
 	return 0;
 }
@@ -4059,7 +3994,7 @@ int do_rootfs_setup(struct lxc_conf *conf, const char *name, const char *lxcpath
 		return -1;
 	}
-	if (setup_rootfs(conf)) {
+	if (lxc_setup_rootfs(conf)) {
 		ERROR("failed to setup rootfs for '%s'", name);
 		return -1;
 	}
@@ -4093,55 +4028,46 @@ static bool verify_start_hooks(struct lxc_conf *conf)
 	return true;
 }
-static int send_fd(int sock, int fd)
-{
-	int ret = lxc_abstract_unix_send_fd(sock, fd, NULL, 0);
-	if (ret < 0) {
-		SYSERROR("Error sending tty fd to parent");
-		return -1;
-	}
-	return 0;
-}
-static int send_ttys_to_parent(struct lxc_handler *handler)
+static int lxc_send_ttys_to_parent(struct lxc_handler *handler)
 {
-	int i, ret;
+	int i;
+	int *ttyfds;
+	struct lxc_pty_info *pty_info;
 	struct lxc_conf *conf = handler->conf;
 	const struct lxc_tty_info *tty_info = &conf->tty_info;
 	int sock = handler->ttysock[0];
-	for (i = 0; i < tty_info->nbtty; i++) {
-		struct lxc_pty_info *pty_info = &tty_info->pty_info[i];
-		ret = send_fd(sock, pty_info->slave);
-		if (ret >= 0)
-			send_fd(sock, pty_info->master);
-		TRACE("sending pty \"%s\" with master fd %d and slave fd %d to "
+	int ret = -1;
+	size_t num_ttyfds = (2 * conf->tty);
+	ttyfds = malloc(num_ttyfds * sizeof(int));
+	if (!ttyfds)
+		return -1;
+	for (i = 0; i < num_ttyfds; i++) {
+		pty_info = &tty_info->pty_info[i / 2];
+		ttyfds[i++] = pty_info->slave;
+		ttyfds[i] = pty_info->master;
+		TRACE("send pty \"%s\" with master fd %d and slave fd %d to "
 		      "parent",
 		      pty_info->name, pty_info->master, pty_info->slave);
-		close(pty_info->slave);
-		pty_info->slave = -1;
-		close(pty_info->master);
-		pty_info->master = -1;
-		if (ret < 0) {
-			ERROR("failed to send pty \"%s\" with master fd %d and "
-			      "slave fd %d to parent : %s",
-			      pty_info->name, pty_info->master, pty_info->slave,
-			      strerror(errno));
-			goto bad;
-		}
 	}
+	ret = lxc_abstract_unix_send_fds(sock, ttyfds, num_ttyfds, NULL, 0);
+	if (ret < 0)
+		ERROR("failed to send %d ttys to parent: %s", conf->tty,
+		      strerror(errno));
+	else
+		TRACE("sent %d ttys to parent", conf->tty);
 	close(handler->ttysock[0]);
 	close(handler->ttysock[1]);
-	return 0;
-bad:
-	ERROR("Error writing tty fd to parent");
-	return -1;
+	for (i = 0; i < num_ttyfds; i++)
+		close(ttyfds[i]);
+	free(ttyfds);
+	return ret;
 }
 int lxc_setup(struct lxc_handler *handler)
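The rewritten function collects all tty descriptors into one array and hands them to `lxc_abstract_unix_send_fds()` in a single call instead of one call per descriptor. The implementation of that helper is not part of this excerpt. As a rough illustration of the underlying mechanism, here is a sketch of passing several descriptors in one `SCM_RIGHTS` control message. The names `send_fds`, `recv_fds`, and the `MAX_FDS` cap are hypothetical:

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_FDS 16 /* arbitrary cap chosen for this sketch */

/* Control buffer sized for MAX_FDS descriptors, aligned for struct cmsghdr. */
union fd_cbuf {
	char buf[CMSG_SPACE(MAX_FDS * sizeof(int))];
	struct cmsghdr align;
};

/* Send nfds descriptors in a single SCM_RIGHTS control message. */
static int send_fds(int sock, const int *fds, int nfds)
{
	union fd_cbuf c;
	char byte = 0; /* at least one byte of payload must accompany the fds */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fds != NULL);
	if (nfds <= 0 || nfds > MAX_FDS)
		return -1;

	memset(&msg, 0, sizeof(msg));
	memset(&c, 0, sizeof(c));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = c.buf;
	msg.msg_controllen = CMSG_SPACE(nfds * sizeof(int));

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(nfds * sizeof(int));
	memcpy(CMSG_DATA(cmsg), fds, nfds * sizeof(int));

	return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}

/* Receive exactly nfds descriptors; fails if a different number arrives. */
static int recv_fds(int sock, int *fds, int nfds)
{
	union fd_cbuf c;
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	struct msghdr msg;
	struct cmsghdr *cmsg;

	assert(fds != NULL);
	if (nfds <= 0 || nfds > MAX_FDS)
		return -1;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = c.buf;
	msg.msg_controllen = sizeof(c.buf);

	if (recvmsg(sock, &msg, 0) < 0)
		return -1;

	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(nfds * sizeof(int)))
		return -1;

	memcpy(fds, CMSG_DATA(cmsg), nfds * sizeof(int));
	return 0;
}
```

Batching the descriptors this way turns `2 * conf->tty` round trips into one message, so a failure is reported once for the whole set rather than per descriptor.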
@@ -4260,7 +4186,7 @@ int lxc_setup(struct lxc_handler *handler)
 		return -1;
 	}
-	if (send_ttys_to_parent(handler) < 0) {
+	if (lxc_send_ttys_to_parent(handler) < 0) {
 		ERROR("failure sending console info to parent");
 		return -1;
 	}
......
@@ -472,7 +472,7 @@ extern void lxc_restore_phys_nics_to_netns(int netnsfd, struct lxc_conf *conf);
 extern int find_unmapped_nsid(struct lxc_conf *conf, enum idtype idtype);
 extern int mapped_hostid(unsigned id, struct lxc_conf *conf, enum idtype idtype);
 extern int chown_mapped_root(char *path, struct lxc_conf *conf);
-extern int ttys_shift_ids(struct lxc_conf *c);
+extern int lxc_ttys_shift_ids(struct lxc_conf *c);
 extern int userns_exec_1(struct lxc_conf *conf, int (*fn)(void *), void *data,
 			 const char *fn_name);
 extern int parse_mntopts(const char *mntopts, unsigned long *mntflags,
......
@@ -481,7 +481,7 @@ struct lxc_handler *lxc_init(const char *name, struct lxc_conf *conf, const char
 		goto out_restore_sigmask;
 	}
-	if (ttys_shift_ids(conf) < 0) {
+	if (lxc_ttys_shift_ids(conf) < 0) {
 		ERROR("Failed to shift tty into container.");
 		goto out_restore_sigmask;
 	}
@@ -1008,23 +1008,16 @@ static int save_phys_nics(struct lxc_conf *conf)
 	return 0;
 }
-static int recv_fd(int sock, int *fd)
+static int lxc_recv_ttys_from_child(struct lxc_handler *handler)
 {
-	if (lxc_abstract_unix_recv_fd(sock, fd, NULL, 0) < 0) {
-		SYSERROR("Error receiving tty file descriptor from child process.");
-		return -1;
-	}
-	if (*fd == -1)
-		return -1;
-	return 0;
-}
-static int recv_ttys_from_child(struct lxc_handler *handler)
-{
-	int i, ret;
+	int i;
+	int *ttyfds;
+	struct lxc_pty_info *pty_info;
+	int ret = -1;
 	int sock = handler->ttysock[1];
 	struct lxc_conf *conf = handler->conf;
 	struct lxc_tty_info *tty_info = &conf->tty_info;
+	size_t num_ttyfds = (2 * conf->tty);
 	if (!conf->tty)
 		return 0;
@@ -1033,25 +1026,31 @@ static int recv_ttys_from_child(struct lxc_handler *handler)
 	if (!tty_info->pty_info)
 		return -1;
-	for (i = 0; i < conf->tty; i++) {
-		struct lxc_pty_info *pty_info = &tty_info->pty_info[i];
+	ttyfds = malloc(num_ttyfds * sizeof(int));
+	if (!ttyfds)
+		return -1;
+	ret = lxc_abstract_unix_recv_fds(sock, ttyfds, num_ttyfds, NULL, 0);
+	for (i = 0; (ret >= 0 && *ttyfds != -1) && (i < num_ttyfds); i++) {
+		pty_info = &tty_info->pty_info[i / 2];
 		pty_info->busy = 0;
-		ret = recv_fd(sock, &pty_info->slave);
-		if (ret >= 0)
-			recv_fd(sock, &pty_info->master);
-		if (ret < 0) {
-			ERROR("failed to receive pty with master fd %d and "
-			      "slave fd %d from child: %s",
-			      pty_info->master, pty_info->slave,
-			      strerror(errno));
-			return -1;
-		}
-		TRACE("received pty with master fd %d and slave fd %d from child",
-		      pty_info->master, pty_info->slave);
+		pty_info->slave = ttyfds[i++];
+		pty_info->master = ttyfds[i];
+		TRACE("received pty with master fd %d and slave fd %d from "
+		      "parent", pty_info->master, pty_info->slave);
 	}
 	tty_info->nbtty = conf->tty;
-	return 0;
+	free(ttyfds);
+	if (ret < 0)
+		ERROR("failed to receive %d ttys from child: %s", conf->tty,
+		      strerror(errno));
+	else
+		TRACE("received %d ttys from child", conf->tty);
+	return ret;
 }
 void resolve_clone_flags(struct lxc_handler *handler)
@@ -1294,7 +1293,7 @@ static int lxc_spawn(struct lxc_handler *handler)
 	cgroups_connected = false;
 	/* Read tty fds allocated by child. */
-	if (recv_ttys_from_child(handler) < 0) {
+	if (lxc_recv_ttys_from_child(handler) < 0) {
 		ERROR("Failed to receive tty info from child process.");
 		goto out_delete_net;
 	}
......
@@ -315,7 +315,7 @@ static int get_pty_on_host(struct lxc_container *c, struct wrapargs *wrap, int *
 	conf->console.descr = &descr;
 	/* Shift ttys to container. */
-	if (ttys_shift_ids(conf) < 0) {
+	if (lxc_ttys_shift_ids(conf) < 0) {
 		ERROR("Failed to shift tty into container");
 		goto err1;
 	}
......
@@ -39,6 +39,11 @@
 #include "initutils.h"
+/* Define __S_ISTYPE if missing from the C library. */
+#ifndef __S_ISTYPE
+#define __S_ISTYPE(mode, mask) (((mode)&S_IFMT) == (mask))
+#endif
 /* Useful macros */
 /* Maximum number for 64 bit integer is a string with 21 digits: 2^64 - 1 = 21 */
 #define LXC_NUMSTRLEN64 21
......