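Using lxcfs for container resource isolation

Introduction

When you run system monitoring commands such as top or free inside a container, what you usually see is not the resources allocated to the container itself but the overall resources of the node it runs on. This is because containers do not isolate the resource view. It makes it hard to check a container's real usage from inside the container, so you have to rely on external monitoring instead.

LXCFS was created to solve this problem: by masking files under /proc, it makes the container look more like a real system.

The following is excerpted from the official introduction: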

LXCFS is a small FUSE filesystem written with the intention of making Linux containers feel more like a virtual machine. It started as a side-project of LXC but is useable by any runtime.

Reference: https://github.com/lxc/lxcfs

LXCFS is a simple userspace filesystem designed to work around some current limitations of the Linux kernel.

Specifically, it’s providing two main things

  • A set of files which can be bind-mounted over their /proc originals to provide CGroup-aware values.
  • A cgroupfs-like tree which is container aware.

Reference: https://linuxcontainers.org/lxcfs/introduction/

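Deployment

lxcfs can be deployed in several ways: you can add a package repository and install it with a package manager such as apt or yum, or you can build and install it from source.

For this test we follow the official documentation and build from the source package.

The complete official steps are as follows. (The official documentation appears to have a typo in the first command: the package should be fuse-libs, not fuse-lib.)

yum install fuse fuse-lib fuse-devel
git clone git://github.com/lxc/lxcfs
cd lxcfs
./bootstrap.sh
./configure
make
make install

All installation steps below come from the official documentation: https://github.com/lxc/lxcfs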

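Installation

  1. Install the build environment

    I am on CentOS 8, so the packages used here are the fuse3 ones rather than fuse; without them, the configure step fails with an error about the missing package. You can also skip this step and work out what is missing from the error when it appears.

    yum install fuse3 fuse3-libs fuse3-devel

  2. Download the source package

    Download: https://github.com/lxc/lxcfs/releases

    The official instructions clone the repository with git, but after cloning, the files that the installation steps refer to were not there.

  3. Extract the archive and enter the lxcfs directory

  4. Run ./bootstrap.sh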

    [root@VM-0-2-centos lxcfs-lxcfs-4.0.9]# ./bootstrap.sh
    ++ set -e
    ++ test -d autom4te.cache
    ++ libtoolize
    libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, 'config'.
    libtoolize: linking file 'config/ltmain.sh'
    libtoolize: putting macros in AC_CONFIG_MACRO_DIRS, 'm4'.
    libtoolize: linking file 'm4/libtool.m4'
    libtoolize: linking file 'm4/ltoptions.m4'
    libtoolize: linking file 'm4/ltsugar.m4'
    libtoolize: linking file 'm4/ltversion.m4'
    libtoolize: linking file 'm4/lt~obsolete.m4'
    ++ aclocal -I m4
    ++ autoheader
    ++ autoconf
    ++ automake --add-missing --copy
    configure.ac:16: installing 'config/compile'
    configure.ac:15: installing 'config/config.guess'
    configure.ac:15: installing 'config/config.sub'
    configure.ac:14: installing 'config/install-sh'
    configure.ac:14: installing 'config/missing'
    Makefile.am: installing './INSTALL'
    src/Makefile.am: installing 'config/depcomp'
    
  5. Run ./configure

    [root@VM-0-2-centos lxcfs-lxcfs-4.0.9]# ./configure
    checking for pkg-config... /usr/bin/pkg-config
    checking pkg-config is at least version 0.9.0... yes
    checking for a BSD-compatible install... /usr/bin/install -c
    checking whether build environment is sane... yes
    checking for a thread-safe mkdir -p... /usr/bin/mkdir -p
    checking for gawk... gawk
    checking whether make sets $(MAKE)... yes
    checking whether make supports nested variables... yes
    checking build system type... x86_64-pc-linux-gnu
    checking host system type... x86_64-pc-linux-gnu
    checking whether make supports the include directive... yes (GNU style)
    checking for gcc... gcc
    checking whether the C compiler works... yes
    checking for C compiler default output file name... a.out
    checking for suffix of executables...
    checking whether we are cross compiling... no
    checking for suffix of object files... o
    checking whether we are using the GNU C compiler... yes
    checking whether gcc accepts -g... yes
    checking for gcc option to accept ISO C89... none needed
    ...
    ...
    ...
    config.status: creating config.h
    config.status: executing depfiles commands
    config.status: executing libtool commands
    
    ----------------------------
    Environment:
     - compiler: gcc
    
    Debugging:
     - tests:
     - ASAN: no
     - mutex debugging:
    
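  6. Run make && make install

    The output here is too long to paste. Just check that there are no errors and look into any that do appear; results may differ between systems.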
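
The six steps above can be collected into one script. This is only a sketch under stated assumptions: the tarball URL follows GitHub's usual tag-archive pattern and should be checked against the releases page, the version 4.0.9 and the lxcfs-lxcfs-4.0.9 directory name are taken from the shell prompts in this post, and the compiler and autotools are assumed to be installed already. The function names tarball_url and build_lxcfs are made up for illustration.

```shell
#!/bin/sh
# Sketch: build lxcfs from a release tarball on CentOS 8.

LXCFS_VERSION=${LXCFS_VERSION:-4.0.9}

# ASSUMPTION: GitHub tag-archive URL pattern; confirm on the releases page.
tarball_url() {
    echo "https://github.com/lxc/lxcfs/archive/refs/tags/lxcfs-$1.tar.gz"
}

build_lxcfs() {
    # fuse3 packages, as used in step 1
    yum install -y fuse3 fuse3-libs fuse3-devel
    curl -fsSL -o "lxcfs-$1.tar.gz" "$(tarball_url "$1")"
    tar -xzf "lxcfs-$1.tar.gz"
    # Directory name as seen in the prompts above (lxcfs-lxcfs-4.0.9)
    cd "lxcfs-lxcfs-$1"
    ./bootstrap.sh
    ./configure
    make
    make install
}

# Build only when explicitly asked to: sh build-lxcfs.sh build
if [ "${1:-}" = "build" ]; then
    set -e   # stop at the first failing step
    build_lxcfs "$LXCFS_VERSION"
fi
```

Run without arguments, the script only defines the functions, so it is safe to read through and adapt before starting a real build.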

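Starting lxcfs

The official instructions run lxcfs in the foreground, which is clearly not how it would normally be used. You can put it in the background with nohup and redirect its output to a log file.

To have lxcfs start automatically, a corresponding boot-time startup entry must be added; if it was installed through a package manager, it should likewise be enabled at boot.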

mkdir -p /var/lib/lxcfs
nohup lxcfs /var/lib/lxcfs >> /var/log/lxcfs.log 2>&1 &

The log looks like this:

Running constructor lxcfs_init to reload liblxcfs
mount namespace: 4
hierarchies:
  0: fd:   5: name=systemd
  1: fd:   6: perf_event
  2: fd:   7: freezer
  3: fd:   8: net_cls,net_prio
  4: fd:   9: memory
  5: fd:  10: blkio
  6: fd:  11: cpu,cpuacct
  7: fd:  12: cpuset
  8: fd:  13: pids
  9: fd:  14: hugetlb
 10: fd:  15: devices
Kernel supports pidfds
Kernel supports swap accounting
api_extensions:
- cgroups
- sys_cpu_online
- proc_cpuinfo
- proc_diskstats
- proc_loadavg
- proc_meminfo
- proc_stat
- proc_swaps
- proc_uptime
- shared_pidns
- cpuview_daemon
- loadavg_daemon
- pidfds
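
Besides reading the log, it can help to confirm that the FUSE mount is actually in place before pointing containers at it. The sketch below assumes the mount appears in /proc/mounts with filesystem type fuse.lxcfs; the helper name lxcfs_mounted is hypothetical.

```shell
#!/bin/sh
# Return 0 if an lxcfs FUSE mount exists at the given directory.
# $1: mount point (default /var/lib/lxcfs)
# $2: mounts table to inspect (default /proc/mounts)
# ASSUMPTION: the filesystem type is reported as "fuse.lxcfs".
lxcfs_mounted() {
    awk -v dir="${1:-/var/lib/lxcfs}" \
        '$2 == dir && $3 == "fuse.lxcfs" { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

if lxcfs_mounted; then
    echo "lxcfs is mounted; files available for bind mounts:"
    ls /var/lib/lxcfs/proc
else
    echo "lxcfs is not mounted on /var/lib/lxcfs"
fi
```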

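Verifying isolation

Starting the workloads

Start two identical workloads whose only difference is that one mounts no volumes while the other mounts files from /var/lib/lxcfs on the host.

Their YAML is shown below with non-essential content omitted. Each container has a memory limit of 1G and a CPU limit of 0.5; these are not shown in the YAML.

With hostPath mounts, i.e. using lxcfs for isolation: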

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: centos
  name: centos
spec:
  selector:
    matchLabels:
      k8s-app: centos
  template:
    metadata:
      labels:
        k8s-app: centos
    spec:
      containers:
      - command:
        - /sbin/init
        image: centos:latest
        imagePullPolicy: Always
        name: centos
        volumeMounts:
        - mountPath: /proc/cpuinfo
          name: cpuinfo
        - mountPath: /proc/diskstats
          name: diskstats
        - mountPath: /proc/meminfo
          name: meminfo
        - mountPath: /proc/stat
          name: stat
        - mountPath: /proc/swaps
          name: swaps
        - mountPath: /proc/uptime
          name: uptime
      volumes:
      - hostPath:
          path: /var/lib/lxcfs/proc/cpuinfo
          type: File
        name: cpuinfo
      - hostPath:
          path: /var/lib/lxcfs/proc/diskstats
          type: File
        name: diskstats
      - hostPath:
          path: /var/lib/lxcfs/proc/meminfo
          type: File
        name: meminfo
      - hostPath:
          path: /var/lib/lxcfs/proc/stat
          type: File
        name: stat
      - hostPath:
          path: /var/lib/lxcfs/proc/swaps
          type: File
        name: swaps
      - hostPath:
          path: /var/lib/lxcfs/proc/uptime
          type: File
        name: uptime

Without hostPath mounts, i.e. not using lxcfs for isolation:

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    k8s-app: centos
  name: centos
spec:
  selector:
    matchLabels:
      k8s-app: centos
  template:
    metadata:
      labels:
        k8s-app: centos
    spec:
      containers:
      - command:
        - /sbin/init
        image: centos:latest
        imagePullPolicy: Always
        name: centos

Verification

Container with hostPath mounts:

[root@centos-6d9c568d6c-nj2vv /]# top
top - 14:13:42 up 8 min,  0 users,  load average: 0.12, 0.29, 0.20
Tasks:   3 total,   1 running,   2 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.2 us,  0.0 sy,  0.0 ni, 99.8 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   1024.0 total,   1001.9 free,      4.5 used,     17.7 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   1019.5 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
      1 root      20   0  100428   6744   5660 S   0.0   0.6   0:00.01 systemd
      6 root      20   0   12044   3144   2656 S   0.0   0.3   0:00.00 bash
     20 root      20   0   49136   3848   3220 R   0.0   0.4   0:00.00 top
 
 
 [root@centos-6d9c568d6c-nj2vv /]# free -m
              total        used        free      shared  buff/cache   available
Mem:           1024           4        1001           0          17        1019
Swap:             0           0           0


 [root@centos-6d9c568d6c-nj2vv /]# uptime
 14:25:27 up 0 min,  0 users,  load average: 0.31, 0.34, 0.25
 

Container without hostPath mounts:

[root@centos-b57f6b6c7-knbtn /]# top
top - 14:15:43 up 12 min,  0 users,  load average: 0.14, 0.23, 0.18
Tasks:   3 total,   1 running,   2 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :   3673.2 total,    941.0 free,   1033.0 used,   1699.2 buff/cache
MiB Swap:      0.0 total,      0.0 free,      0.0 used.   2439.9 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
      1 root      20   0  100428   5400   4360 S   0.0   0.1   0:00.01 systemd
      6 root      20   0   12048   3188   2744 S   0.0   0.1   0:00.00 bash
     20 root      20   0   49132   3792   3160 R   0.0   0.1   0:00.00 top
     
     
 [root@centos-b57f6b6c7-knbtn /]# free -m
              total        used        free      shared  buff/cache   available
Mem:           3673        1028         945           3        1699        2444
Swap:             0           0           0   
     
     
 [root@centos-b57f6b6c7-knbtn /]# uptime
 14:16:07 up 12 min,  0 users,  load average: 0.09, 0.21, 0.18

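As the output above shows, the container without hostPath mounts sees the node's information rather than its own real values, while the container with hostPath mounts sees the container's information.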
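
The same comparison can be made numerically from inside a container by putting MemTotal from /proc/meminfo next to the cgroup memory limit. This is a rough sketch: the cgroup v1 path is an assumption based on the hierarchies listed in the lxcfs log earlier, and the helper names meminfo_total_mib and bytes_to_mib are made up for illustration.

```shell
#!/bin/sh
# Print MemTotal in MiB from a meminfo-style file (default: /proc/meminfo).
meminfo_total_mib() {
    awk '/^MemTotal:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Convert a byte count (e.g. the contents of memory.limit_in_bytes) to MiB.
bytes_to_mib() {
    echo $(( $1 / 1024 / 1024 ))
}

# ASSUMPTION: cgroup v1 layout; on other setups the file lives elsewhere.
LIMIT_FILE=/sys/fs/cgroup/memory/memory.limit_in_bytes
if [ -r "$LIMIT_FILE" ]; then
    echo "meminfo total: $(meminfo_total_mib) MiB"
    echo "cgroup limit:  $(bytes_to_mib "$(cat "$LIMIT_FILE")") MiB"
fi
```

With lxcfs mounted, the two numbers should agree (1024 MiB for the 1G limit used here); without it, the first line reports the node's memory instead.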

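Summary

lxcfs currently supports view isolation for the paths below. I have not tested each one; check the official site for the current list when you use it, in case a change makes a mount ineffective.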

/proc/cpuinfo
/proc/diskstats
/proc/meminfo
/proc/stat
/proc/swaps
/proc/uptime
/proc/slabinfo
/sys/devices/system/cpu
/sys/devices/system/cpu/online

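Pros:

Obviously, with isolation in place, inspecting a container's resources is much more convenient.

Cons:

Also obvious: many hostPath volumes have to be configured, and starting a container with docker run likewise means mounting many volumes, which is inconvenient. It remains to be seen whether this feature will eventually be integrated into the Linux kernel so that isolation is the default.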
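
For the docker run case, the repeated volume flags can at least be generated in a loop. A minimal sketch, assuming lxcfs is mounted on /var/lib/lxcfs as above and reusing the six files from the Deployment; the function names lxcfs_volume_args and run_with_lxcfs are hypothetical.

```shell
#!/bin/sh
LXCFS_ROOT=/var/lib/lxcfs

# Print one "-v host:container:rw" flag per lxcfs-backed proc file.
lxcfs_volume_args() {
    for f in cpuinfo diskstats meminfo stat swaps uptime; do
        printf '%s %s/proc/%s:/proc/%s:rw ' '-v' "$LXCFS_ROOT" "$f" "$f"
    done
}

# Start a container with the same limits as the test workload (1G, 0.5 CPU).
run_with_lxcfs() {
    # The unquoted substitution is intentional: each flag must become
    # its own argument.
    docker run --rm -it -m 1g --cpus 0.5 $(lxcfs_volume_args) centos:latest free -m
}
```

Calling run_with_lxcfs on a host where docker and lxcfs are both running should print a free -m table whose total reflects the 1G limit rather than the node's memory.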

Problems encountered

When lxcfs crashes, the log prints Transport endpoint is not connected, and the /var/lib/lxcfs directory looks like this:

[root@VM-0-2-centos lib]# ll /var/lib |grep lxcfs
ls: cannot access '/var/lib/lxcfs': Transport endpoint is not connected
d?????????  ? ?       ?           ?            ? lxcfs

This can be resolved as follows, after which lxcfs can simply be restarted.

fusermount -u /var/lib/lxcfs

Reference: https://github.com/lxc/lxcfs/issues/73
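
If lxcfs is still being run by hand with nohup, the unmount and restart can be wrapped in a small helper. This is only a sketch: it treats a mount point that is still listed in /proc/mounts but can no longer be stat'ed as the broken state shown above, and the function names mount_is_broken and recover_lxcfs are made up for illustration.

```shell
#!/bin/sh
# Return 0 when the directory is listed as a mount point but cannot be
# stat'ed, which is how a dead FUSE mount appears.
# $1: mount point; $2: mounts table to inspect (default /proc/mounts)
mount_is_broken() {
    awk -v dir="$1" '$2 == dir { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}" && ! stat "$1" >/dev/null 2>&1
}

# Unmount the stale endpoint and start lxcfs again in the background.
recover_lxcfs() {
    dir=${1:-/var/lib/lxcfs}
    if mount_is_broken "$dir"; then
        fusermount -u "$dir"
        nohup lxcfs "$dir" >> /var/log/lxcfs.log 2>&1 &
        echo "lxcfs restarted on $dir"
    else
        echo "$dir looks healthy, nothing to do"
    fi
}
```

Running recover_lxcfs after a crash performs the same two steps described above; on a healthy mount it does nothing.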

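I later found that when lxcfs is installed from a binary package and managed by systemd, the generated lxcfs.service unit file already handles this problem.

So let's manage it with systemd: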

cat > /etc/systemd/system/lxcfs.service << EOF
[Unit]
Description=lxcfs

[Service]
ExecStart=/usr/local/bin/lxcfs /var/lib/lxcfs/
KillMode=process
Restart=on-failure
ExecStopPost=-/bin/fusermount -u /var/lib/lxcfs
Delegate=yes

[Install]
WantedBy=multi-user.target
EOF

Don't forget to run systemctl enable lxcfs.