Commits · 2b89a9c19db30894e2476a5a750c443dee339d70 · Chen Yisong / lxc

19 Aug, 2013 8 commits

Add missing sys/select.h include for fd_set · 2b89a9c1

authored Aug 16, 2013

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

2b89a9c1

Add missing syscall.h include to utils.h · ec346ea1

authored Aug 16, 2013

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

ec346ea1

Add arm defines for __NR_signalfd(4) · 180edd67

authored Aug 16, 2013

Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

180edd67

Android now uses a sane clone() definition · 590ae889

authored Aug 16, 2013

The current Android NDK provides a clone() definition that's identical to
eglibc's so we can drop the ifdef from that one.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

590ae889

Define BLKGETSIZE64 and LO_FLAGS_AUTOCLEAR · bff13ba2

authored Aug 16, 2013

Those two aren't always around (specifically on bionic), so add some
defines in case they aren't already defined.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

bff13ba2

Export the local getmntent_r implementation · 92adc3e9

authored Aug 16, 2013

New code now uses getmntent_r so we need it exported so that it can be
used when building on bionic.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

92adc3e9

Replace all calls to rindex by strrchr · c32981c3

authored Aug 16, 2013

The two functions are identical but strrchr also works on Bionic.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

c32981c3

Add a local implementation of ifaddrs.h · 4ba0d9af

authored Aug 16, 2013

This adds a local ifaddrs implementation to be used on Bionic or other C
libraries that don't come with a getifaddrs implementation.

This code was written by Kenneth MacKay and is under a two-clause BSD
license (copyright information in the file headers).
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

4ba0d9af

16 Aug, 2013 5 commits

ubuntu-cloud-prep: patch /sbin/start for overlayfs · d24d56d7

authored Aug 16, 2013

upstart depends on inotify, and overlayfs does not support inotify.

That means that the following results in 'tgt' not running. tgt is simply
used here as an example of a service that installs an upstart job and
starts it on package install.
 lxc-clone -s -B overlayfs -o source-precise-amd64 -n test1
 lxc-start -n test1
 ..
 apt-get install tgt

The change here is to modify /sbin/start inside the container so that when
something explicitly tries 'start', it results in an explicit call to
'initctl reload-configuration' so that upstart is aware of the newly
placed job.

Should overlayfs ever gain inotify support, this should still not cause
any harm.
Signed-off-by: Scott Moser <smoser@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

d24d56d7

lxc-clone: default to overlayfs for -s clone of dir · e3fdf5cc

authored Aug 16, 2013

If you go to the trouble to request a -s (snapshot) clone of
a container which is dir backingstore, then you deserve an
overlayfs clone.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

e3fdf5cc

cgroup.c: remove spurious ERROR messages · 6fe93aa1

authored Aug 16, 2013

Because they are in probing functions.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>

6fe93aa1

Replace a few more str(n)dupa by str(n)dup + free · d74325c4

authored Aug 16, 2013

strdupa and strndupa still don't exist on bionic, so we need to do the
alloca() call ourselves or free the memory by hand.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>

d74325c4

Add attach_options.h to the list of included files · 1d374b97

authored Aug 16, 2013

Without this, make dist doesn't include it and LXC fails to build.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>

1d374b97

15 Aug, 2013 5 commits

document new lxc-create btrfs behavior · fbbf5192

Serge Hallyn authored Aug 15, 2013

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

fbbf5192

bdev: support -B best and -B lvm,dir · d44e88c2

authored Aug 15, 2013

-B best will check whether btrfs, zfs, or lvm can be used,
in that order, and fall back to dir.

-B lvm,btrfs will try lvm first, then btrfs, then fail.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

d44e88c2
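
The selection order described for -B can be sketched in a few lines. This is only an illustration of the fallback logic, not the bdev code itself: `pick_backing_store` and the `usable` probe are hypothetical names introduced here.

```python
def pick_backing_store(spec, usable):
    """Return the first usable backing store named by a -B style spec.

    spec   -- "best" or a comma-separated list such as "lvm,btrfs"
    usable -- hypothetical probe: callable(name) -> bool
    """
    if spec == "best":
        # "best" tries btrfs, zfs, lvm in that order and falls back to dir.
        candidates = ["btrfs", "zfs", "lvm", "dir"]
    else:
        # An explicit list is tried in the order given, with no fallback.
        candidates = spec.split(",")
    for name in candidates:
        if usable(name):
            return name
    return None  # nothing in the list could be used: the caller should fail


# Example: a host where only lvm and dir are usable.
available = {"lvm", "dir"}
print(pick_backing_store("best", available.__contains__))       # lvm
print(pick_backing_store("lvm,btrfs", available.__contains__))  # lvm
print(pick_backing_store("btrfs,zfs", available.__contains__))  # None
```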

bdev_create: don't default to btrfs if possible · d3060bd0

authored Aug 15, 2013

Ideally it would be great to default to a btrfs subvolume for each new
container created.  However, this is not, as we previously thought,
without consequence.  'rsync --one-file-system' will not descend into
btrfs subvolumes.  This means that 'lxc-create -B _unset' will cause
different behavior for rsync -vax /var/lib/lxc based on whether that
fs is btrfs or not.

So don't do that.  If -B is not specified, use -B dir.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

d3060bd0

Add subdir-objects option to AM_INIT_AUTOMAKE · d007f8ab

authored Aug 15, 2013

Fix build with automake 1.14 and newer, since it requires explicit
setting now.
Signed-off-by: Alexander Vladimirov <alexander.idkfa.vladimirov@gmail.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

d007f8ab

lxc-fedora: New patch for systemd detection and init configuration. · bf7d3153

authored Aug 15, 2013

Satoshi Matsumoto certainly had the right idea in spotting a bug in
the lxc-fedora template for systemd detection.  Heart was in the right
spot but patch was not what we needed.

I've looked the patch code over for systemd support and init/upstart
support and modified the logic appropriately.  If /etc/systemd/system
exists, we'll do the right thing by systemd.  If /etc/rc.sysinit exists,
we'll do the right thing by init / upstart.  If both are installed,
we'll try to accommodate both in case someone is playing games with
the two (I've done this).

Patch was trivial, just took more time to actually test it and create
some containers with it and verify them, than it did to code them.
Signed-off-by: Michael H. Warfield <mhw@WittsEnd.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

bf7d3153
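
The detection rule in this message (check for /etc/systemd/system, check for /etc/rc.sysinit, handle both if both exist) might look roughly like the following. The template itself is a shell script; this Python sketch only mirrors the decision logic, and `detect_init_systems` is a hypothetical name.

```python
import os
import tempfile


def detect_init_systems(rootfs):
    """Return which init setups appear to be present under rootfs."""
    found = []
    # Presence of etc/systemd/system is taken as "systemd is installed".
    if os.path.isdir(os.path.join(rootfs, "etc", "systemd", "system")):
        found.append("systemd")
    # Presence of etc/rc.sysinit is taken as "init / upstart is installed".
    if os.path.exists(os.path.join(rootfs, "etc", "rc.sysinit")):
        found.append("init/upstart")
    return found


# Demo against a throwaway directory tree standing in for a container rootfs.
with tempfile.TemporaryDirectory() as rootfs:
    os.makedirs(os.path.join(rootfs, "etc", "systemd", "system"))
    print(detect_init_systems(rootfs))   # ['systemd']
    open(os.path.join(rootfs, "etc", "rc.sysinit"), "w").close()
    print(detect_init_systems(rootfs))   # ['systemd', 'init/upstart']
```

Returning a list rather than a single value keeps the "both installed" case explicit, so a caller can configure each one in turn.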

14 Aug, 2013 14 commits

attach: implement remaining options of lxc_attach_set_environment · 3d5e9f48

authored Aug 13, 2013

This patch implements the extra_env and extra_keep options of
lxc_attach_set_environment.

The Python implementation, the C container API and the lxc-attach
utility are able to utilize this feature; lxc-attach has gained two new
command line options for this.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

3d5e9f48

python: add attach support · d7a09c63

authored May 21, 2013

Add methods attach() and attach_wait() to the Python API that give
access to the attach functionality of LXC. Both accept two main
arguments:

1. run: A python function that is executed inside the container
2. payload: (optional) A parameter that will be passed to the python
            function

Additionally, the following keyword arguments are supported:

attach_flags: How attach should operate, i.e. whether to attach to
              cgroups, whether to drop capabilities, etc. The following
              constants are defined as part of the lxc module that may
              be OR'd together for this option:
                LXC_ATTACH_MOVE_TO_CGROUP
                LXC_ATTACH_DROP_CAPABILITIES
                LXC_ATTACH_SET_PERSONALITY
                LXC_ATTACH_APPARMOR
                LXC_ATTACH_REMOUNT_PROC_SYS
                LXC_ATTACH_DEFAULT
namespaces: Which namespaces to attach to, as defined as the flags that
            may be passed to the clone(2) system call. Note: maybe we
            should export these flags too.
personality: The personality of the process, it will be passed to the
             personality(2) syscall. Note: maybe we should provide
             access to the function that converts arch into
             personality.
initial_cwd: The initial working directory after attaching.
uid: The user id after attaching.
gid: The group id after attaching.
env_policy: The environment policy, may be one of:
              LXC_ATTACH_KEEP_ENV
              LXC_ATTACH_CLEAR_ENV
extra_env_vars: A list (or tuple) of environment variables (in the form
                KEY=VALUE) that should be set once attach has
                succeeded.
extra_keep_env: A list (or tuple) of names of environment variables
                that should be kept regardless of policy.
stdin: A file/socket/... object that should be used as stdin for the
       attached process. (If not a standard Python object, it has to
       implement the fileno() method and provide a fd as the result.)
stdout, stderr: See stdin.

attach() returns the PID of the attached process, or -1 on failure.

attach_wait() returns the return code of the attached process after
it has finished executing, or -1 on failure. Note that if the exit
status of the process is 255, -1 will also be returned, since attach
failures result in an exit code of 255.

Two default run functions are also provided in the lxc module:

attach_run_command: Runs the specified command
attach_run_shell: Runs a shell in the container

Examples (assuming c is a Container object):

c.attach_wait(lxc.attach_run_command, 'id')
c.attach_wait(lxc.attach_run_shell)
def foo():
  print("Hello World")
  # the following line is important, otherwise the exit code of
  # the attached program will be -1
  # sys.exit(0) will also work
  return 0
c.attach_wait(foo)
c.attach_wait(lxc.attach_run_command, ['cat', '/proc/self/cgroup'])
c.attach_wait(lxc.attach_run_command, ['cat', '/proc/self/cgroup'],
              attach_flags=(lxc.LXC_ATTACH_DEFAULT &
              ~lxc.LXC_ATTACH_MOVE_TO_CGROUP))

Note that while it is possible to execute Python code inside the
container by passing a function (see example), it is unwise to import
modules, since there is no guarantee that the Python installation
inside the container is in any way compatible with that outside of it.
If you want to run Python code directly, please import all modules
before attaching and only use them within the container.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

d7a09c63
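
The note about exit status 255 can be made concrete with a small model. This does not use the lxc module: `attach_wait_result` and `run_child` are hypothetical helpers, and an ordinary child Python process stands in for the attached process.

```python
import subprocess
import sys

ATTACH_FAILURE_STATUS = 255  # per the message: attach failures exit with 255


def attach_wait_result(exit_status):
    """Model of the documented mapping from exit status to return value."""
    return -1 if exit_status == ATTACH_FAILURE_STATUS else exit_status


def run_child(code):
    # Stand-in for an attached process: a child interpreter exiting with `code`.
    proc = subprocess.run(
        [sys.executable, "-c", "import sys; sys.exit(%d)" % code]
    )
    return proc.returncode


print(attach_wait_result(run_child(0)))    # 0
print(attach_wait_result(run_child(3)))    # 3
print(attach_wait_result(run_child(255)))  # -1
```

A callback that legitimately exits with 255 is therefore indistinguishable from a failed attach, which is worth keeping in mind when choosing exit codes for run functions.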

python: improve convert_tuple_to_char_pointer_array · b7f2846a

authored Aug 13, 2013

convert_tuple_to_char_pointer_array now also accepts lists and not only
tuples when converting to a C array. Other fixes:

 - some checking that it's actually a list/tuple before trying to
   convert
 - off-by-a-few-bytes allocation error
   (sizeof(char *)*n+1 vs. sizeof(char *)*(n+1)/calloc(...))
Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

b7f2846a
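
The allocation error is easy to see with numbers. The sketch below only does the arithmetic from the message, using ctypes to get the platform's pointer size; `buggy_size` and `fixed_size` are illustrative names, not functions from the binding.

```python
import ctypes

PTR = ctypes.sizeof(ctypes.c_char_p)  # size of a char * on this platform


def buggy_size(n):
    # sizeof(char *) * n + 1: room for n pointers plus one stray byte
    return PTR * n + 1


def fixed_size(n):
    # sizeof(char *) * (n + 1): room for n pointers plus a NULL terminator
    return PTR * (n + 1)


n = 3
print(buggy_size(n), fixed_size(n))
# The terminating NULL pointer needs PTR bytes, but only 1 was reserved.
print(fixed_size(n) - buggy_size(n))  # PTR - 1 bytes short
```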

apparmor/attach: make sure buffer is NUL-terminated · 626ad11b

authored Aug 13, 2013

Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

626ad11b

Add attach support to container C API · a0e93eeb

authored May 21, 2013

Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

a0e93eeb

Add helper functions to convert va_list of char* to char**. · 61a1d519

Christian Seiler authored May 21, 2013

Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

61a1d519

lxc-attach: Completely rework lxc-attach and move to API function · 9c4693b8

authored May 08, 2013

 - Move attach functionality to a completely new API function for
   attaching to containers. The API function accepts the name of the
   container, the lxcpath, a structure indicating options for attaching
   and returns the pid of the attached process. The calling thread may
   then use waitpid() or similar to wait for the attached process to
   finish. lxc-attach itself is just a simple wrapper around the new
   API function.

 - Use CLONE_PARENT when creating the attached process from the
   intermediate process. This allows the intermediate process to exit
   immediately after attach and the original thread may supervise the
   attached process directly.

 - Since the intermediate process exits quickly, its only job is to
   send the original process the pid of the attached process (as seen
   from outside the pidns) and exit. This allows us to simplify the
   synchronisation logic by quite a bit.

 - Use O_CLOEXEC / SOCK_CLOEXEC on (hopefully) all FDs opened in the
   main thread by the attach logic so that other threads of the same
   program may safely fork+exec off. Also, use shutdown() on the
   synchronisation socket, so that if another thread forks off without
   exec'ing, the synchronisation will not fail. (Not tested whether
   this solves this issue.)

 - Instead of directly specifying a program to execute on the API
   level, one specifies a callback function and a payload. This allows
   code using the API to execute a custom function directly inside the
   container without having to execute a program. Two default callbacks
   are provided directly, one to execute an arbitrary program, another
   to execute a shell. The lxc-attach utility will always use either
   one of these default callbacks.

 - More fine-grained control of the attached process on the API level
   (not implemented in lxc-attach utility yet, some may not be sensible):
     * Specify which file descriptors should be stdin/stdout/stderr of
       the newly created process. If fds other than 0/1/2 are
       specified, they will be dup'd in the attached process (and the
       originals closed). This allows e.g. threaded applications to
       specify pipes for communication with the attached process
       without having to modify its own stdin/stdout/stderr before
       running lxc-attach.
     * Specify user and group id for the newly attached process.
     * Specify initial working directory for the newly attached
       process.
     * Fine-grained control on whether to do any, all or none of the
       following: move attached process into the container's init's
       cgroup, drop capabilities of the process, set the process's
       personality, load the proper apparmor profile and (for partial
       attaches to any but not mount-namespaces) whether to unshare the
       mount namespace and remount /sys and /proc. If additional
       features (SELinux policy, SMACK policy, ...) are implemented,
       flags for those may also be provided.
Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

9c4693b8

Fix return type of read/write utility functions. · 650468bb

authored May 21, 2013

Signed-off-by: Christian Seiler <christian@iwakd.de>
Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>

650468bb

lxc-stop: exit with 1 or 2, not -1 or -2. · b93aac46

Serge Hallyn authored Aug 14, 2013

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

b93aac46

lxc_destroy: print an error if the container is not defined. · 01e6b714

Serge Hallyn authored Aug 14, 2013

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

01e6b714

cgroups: rework to handle nested containers with multiple and partial mounts · b98f7d6e

authored Aug 09, 2013

Currently, if you create a container and use the mountcgroup hook,
you get the /lxc/c1/c1.real cgroup mounted to /.  If you then try
to start containers inside that container, lxc can get confused.
This patch addresses that, by accepting that the cgroup as found
in /proc/self/cgroup can be partially hidden by bind mounts.

In this patch:

Add optional 'lxc.cgroup.use' to /etc/lxc/lxc.conf to specify which
mounted cgroup filesystems lxc should use.  So far only the cgroup
creation respects this.

Keep separate cgroup information for each cgroup mountpoint.  So if
the caller is in devices cgroup /a but cpuset cgroup /b that should
now be ok.

Change how we decide whether to ignore failure to set devices cgroup
settings.  Actually look to see if our current cgroup already has the
settings.  If not, add them.

Finally, the real reason for this patch: in a nested container,
/proc/self/cgroup says nothing about where under /sys/fs/cgroup you
might find yourself.  Handle this by searching for our pid in tasks
files, and keep that info in the cgroup handler.

Also remove all strdupa from cgroup.c (not android-friendly).
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

b98f7d6e
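
The "search for our pid in tasks files" idea can be sketched as a directory walk. This is a simplified illustration, not the cgroup.c implementation: `find_cgroup_of_pid` is a hypothetical helper, and the demo runs against a fake tree rather than a real /sys/fs/cgroup mount.

```python
import os
import tempfile


def find_cgroup_of_pid(mountpoint, pid):
    """Return the cgroup path (relative to mountpoint) whose tasks file lists pid."""
    target = str(pid)
    for dirpath, _dirs, files in os.walk(mountpoint):
        if "tasks" not in files:
            continue
        try:
            with open(os.path.join(dirpath, "tasks")) as f:
                if any(line.strip() == target for line in f):
                    return os.path.relpath(dirpath, mountpoint)
        except OSError:
            continue  # unreadable tasks file: keep searching
    return None


# Demo on a fake hierarchy: pid 42 lives in lxc/c1, pid 1 in the root cgroup.
with tempfile.TemporaryDirectory() as root:
    nested = os.path.join(root, "lxc", "c1")
    os.makedirs(nested)
    with open(os.path.join(root, "tasks"), "w") as f:
        f.write("1\n")
    with open(os.path.join(nested, "tasks"), "w") as f:
        f.write("42\n")
    print(find_cgroup_of_pid(root, 42))  # lxc/c1
    print(find_cgroup_of_pid(root, 1))   # .
    print(find_cgroup_of_pid(root, 7))   # None
```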

lxc-user-nic: specify config and db files in autoconf · 070a4b8e

Serge Hallyn authored Aug 09, 2013

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

070a4b8e

add lxc-user-nic · 20ab58c7

authored Aug 09, 2013

It is meant to be run setuid-root to allow unprivileged users to
tunnel veths from a host bridge to their containers.  The program
looks at /etc/lxc/lxc-usernet which has entries of the form

	user type bridge number

The type currently must be veth.  Whenever lxc-user-nic creates a
nic for a user, it records it in /var/lib/lxc/nics (better location
is needed).  That way when a container dies lxc-user-nic can cull
the dead nic from the list.

The -DISTEST allows lxc-user-nic to be compiled so that it uses
files under /tmp and doesn't actually create the nic, so that
unprivileged users can compile and test the code.  lxc-test-usernic
is a script which runs a few tests using lxc-usernic-test, which
is a version of lxc-user-nic compiled with -DISTEST.

The next step, after issues with this code are raised and addressed,
is to have lxc-start, when running unprivileged, call out to
lxc-user-nic (will have to exec so that setuid-root is honored).
On top of my previous unprivileged-creation patchset, that should
allow unprivileged users to create and start useful containers.

Also update .gitignore.
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

20ab58c7
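
A reader for the "user type bridge number" entries could be sketched as below. This is not the lxc-user-nic parser: `parse_usernet` is a hypothetical helper, and skipping comment or malformed lines is an assumption made for the example, since the message does not say how such lines are treated.

```python
def parse_usernet(text):
    """Parse 'user type bridge number' lines into tuples."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: lines that are comments or lack four fields are ignored.
        if len(fields) != 4 or fields[0].startswith("#"):
            continue
        user, nic_type, bridge, number = fields
        # Per the message, the type currently must be veth.
        if nic_type != "veth" or not number.isdigit():
            continue
        entries.append((user, nic_type, bridge, int(number)))
    return entries


sample = """\
# user type bridge number
alice veth lxcbr0 2
bob tap br0 1
broken line
"""
print(parse_usernet(sample))  # [('alice', 'veth', 'lxcbr0', 2)]
```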

hooks/Makefile.am: add ubuntu-cloud-prep · 3fb18be9

Serge Hallyn authored Aug 14, 2013

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

3fb18be9

13 Aug, 2013 2 commits

lxc.conf.sgml.in: note the arguments and environment variables passed to hooks · baece282

Serge Hallyn authored Aug 13, 2013

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

baece282

mountcgroups: use the right configuration file! · 8bb17b77

Serge Hallyn authored Aug 13, 2013

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

8bb17b77

12 Aug, 2013 2 commits

ubuntu-cloud-prep: cleanup, fix bug with userdata · 79159a86

authored Aug 10, 2013

--userdata was broken, completely missing an implementation.
This adds that implementation back in, makes 'debug' logic
correct, and then also improves the doc at the top.
Signed-off-by: Scott Moser <smoser@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

79159a86

lxc-destroy: Fix regular expression for getting rootfs · 034a0159

authored Aug 12, 2013

The `lxc-destroy` script was using a simple `grep` for extracting
`lxc.rootfs` from the lxc config. This regex also matches commented lines
and breaks at least removing btrfs subvolumes if the string `lxc.rootfs`
is mentioned in a comment. Furthermore, due to the unescaped dot in the
regex it would also match other wrong strings like `lxc rootfs`.

This patch modifies the regular expression to correctly match the beginning
of the line plus potential whitespace characters and the string
`lxc.rootfs`.
Signed-off-by: Franz Pletz <fpletz@fnordicwalking.de>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

034a0159
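
The difference between the old and new match can be demonstrated with Python's re module. The script itself uses grep, so the exact pattern syntax there may differ; `LOOSE` and `STRICT` are illustrative patterns written for this example, not copied from the patch.

```python
import re

# Old behaviour: unanchored, and the unescaped dot matches any character.
LOOSE = re.compile(r"lxc.rootfs")
# New behaviour as described: start of line, optional whitespace, literal dot.
STRICT = re.compile(r"^\s*lxc\.rootfs")

lines = [
    "# lxc.rootfs = /old/path",               # commented out
    "lxc rootfs = /tmp/x",                    # space instead of a dot
    "  lxc.rootfs = /var/lib/lxc/c1/rootfs",  # the real setting
]

for line in lines:
    print(bool(LOOSE.search(line)), bool(STRICT.match(line)), repr(line))
# The loose pattern matches all three lines; the strict one only the last.
```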

09 Aug, 2013 4 commits

ubuntu-cloud-prep: fix bad declare of VERBOSITY · 54e339f9

authored Aug 09, 2013

Signed-off-by: Scott Moser <smoser@ubuntu.com>
Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>

54e339f9

add a clone hook for ubuntu-cloud images · 65d8ae9c

authored Aug 08, 2013

This adds the ability to specify '--userdata' arguments to 'create' or
to 'clone'. So now, the following gives a very fast start of instances with
different user-data.

$ sudo lxc-create -t ubuntu-cloud -n precise -- \
   -r precise --arch amd64

$ sudo lxc-clone -B overlayfs -o precise -s -n ephem1 \
   --userdata="my.userdata1"
$ sudo lxc-clone -B overlayfs -o precise -s -n ephem2 \
   --userdata="my.userdata2"

Also present here is
 * an improvement to the static list of Ubuntu releases. It uses
   ubuntu-distro-info if available and falls back to a static list on failure.
 * moving of the replacement variables to the top of the create template. This
   is just to make it more obvious what is being replaced and put them in a
   single location.
Signed-off-by: Scott Moser <smoser@ubuntu.com>

65d8ae9c
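
The "use ubuntu-distro-info if available, otherwise a static list" behaviour might be sketched like this. The template is a shell script, so this is only a model of the fallback: `ubuntu_releases` is a hypothetical helper, the contents of `STATIC_RELEASES` are illustrative, and the `--all` invocation is an assumption about how the tool would be called.

```python
import subprocess

# Illustrative fallback list; the template's actual static list may differ.
STATIC_RELEASES = ["lucid", "precise", "quantal", "raring"]


def ubuntu_releases(cmd=("ubuntu-distro-info", "--all")):
    """Ask the distro-info tool for release names, else use the static list."""
    try:
        out = subprocess.run(
            list(cmd), capture_output=True, text=True, check=True
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        # Tool missing or failed: degrade to the static list.
        return list(STATIC_RELEASES)
    return out.split() or list(STATIC_RELEASES)


# With a command that does not exist, the static list is returned.
print(ubuntu_releases(cmd=("no-such-distro-info-tool",)))
```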

Cleanup Makefile.am · 1c8e4ee0

authored Aug 09, 2013

Remove some dead code and fix indentation, no functional change.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>

1c8e4ee0

Replace mktemp() by a new mkifname() · 4a0ba80d

authored Aug 09, 2013

Using mktemp() leads to build time warnings and isn't actually
appropriate for what we want to do as it's checking for the existence of
a file and not a network interface.

Replace those calls by an equivalent mkifname() function which uses the
same template as mktemp but instead checks for existing network
interfaces.
Signed-off-by: Stéphane Graber <stgraber@ubuntu.com>
Acked-by: Serge Hallyn <serge.hallyn@ubuntu.com>

4a0ba80d
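
The idea behind mkifname() (fill in a template, but check against network interfaces instead of files) can be sketched as follows. This is a simplified model rather than the C function: every 'X' in the template is replaced, the `existing` set and the `attempts` limit are assumptions of this example.

```python
import random
import string


def mkifname(template, existing, attempts=1000):
    """Fill each 'X' in template with a random character, avoiding existing names."""
    alphabet = string.ascii_letters + string.digits
    for _ in range(attempts):
        name = "".join(
            random.choice(alphabet) if ch == "X" else ch for ch in template
        )
        # The check is against interface names, not against the filesystem.
        if name not in existing:
            return name
    raise RuntimeError("no free interface name for template %r" % template)


# `existing` would normally come from the system, for example
# {name for _, name in socket.if_nameindex()} on Linux.
existing = {"lo", "eth0", "lxcbr0"}
name = mkifname("vethXXXXXX", existing)
print(name, name not in existing)
```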