
Analysis of systemd's Automatic Unmount Mechanism


Author: 耗子007


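I ran into a case where systemd automatically unmounted some directories, which caused failures. Why does systemd auto-unmount in this situation?
This post gives a brief analysis.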

Note: the problem was observed with systemd-219-19.el7.x86_64.

Reliable way to reproduce the problem

[root@lin ~]# mount -t ramfs /dev/nonexistent /hello/kitty
[root@lin ~]# echo $?
0
[root@lin ~]# mount | grep /hello/kitty
[root@lin ~]# umount /hello/kitty
umount: /hello/kitty: not mounted
[root@lin ~]# rmdir /hello/kitty

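Here /dev/nonexistent stands for a device that does not exist. Note that the device path must be under /dev for the problem to trigger.
/var/log/messages then contains the following log (the unit name dev-littlecat.device suggests this log was captured from a run that used /dev/littlecat as the device):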

Jun  1 11:07:44 ws systemd: Unit hello-kitty.mount is bound to inactive unit dev-littlecat.device. Stopping, too.
Jun 1 11:07:44 ws systemd: Unmounting /hello/kitty...
Jun 1 11:07:44 ws systemd: Unmounted /hello/kitty.
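
The unit names in this log follow from the paths involved: /hello/kitty becomes hello-kitty.mount, and a device node such as /dev/littlecat becomes dev-littlecat.device. The sketch below illustrates that mapping in C under simplifying assumptions: simple_unit_name_from_path is a hypothetical helper, not systemd's API, and it skips the escaping that systemd applies to special characters and to literal '-' inside path components.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified sketch of how a path maps to a unit name: drop the leading
 * '/', turn each remaining '/' into '-', then append the suffix.
 * Assumption: the path is absolute, is not "/", has no trailing slash and
 * contains no characters that systemd would escape (such as '-'). */
static int simple_unit_name_from_path(const char *path, const char *suffix,
                                      char *out, size_t out_size) {
    assert(path && suffix && out);

    if (path[0] != '/' || path[1] == '\0')
        return -1; /* only absolute, non-root paths are handled here */

    int n = snprintf(out, out_size, "%s%s", path + 1, suffix);
    if (n < 0 || (size_t) n >= out_size)
        return -1; /* output buffer too small */

    /* Rewrite only the path part, never the suffix. */
    size_t path_len = strlen(path + 1);
    for (size_t i = 0; i < path_len; i++)
        if (out[i] == '/')
            out[i] = '-';

    return 0;
}
```

Calling it with "/hello/kitty" and ".mount" fills the buffer with hello-kitty.mount, which is the mount unit named in the log.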

Reference documentation:

Watching mountinfo

Call flow for watching mountinfo

src/core/main.c main
--> src/core/manager.c manager_startup
--> src/core/manager.c manager_enumerate
--> src/core/mount.c mount_enumerate
--> src/libsystemd/sd-event/sd-event.c sd_event_add_io
--> src/libsystemd/sd-event/sd-event.c source_io_register

Note: manager_enumerate loads all units by running the enumerate operation of each unit type. For the mount unit type that operation is mount_enumerate,
so mount_enumerate is called.

The registrations made in mount_enumerate are:

sd_event_add_io(m->event, &m->mount_event_source,
                fileno(m->proc_self_mountinfo),
                EPOLLPRI,
                mount_dispatch_io,
                m);

sd_event_add_io(m->event, &m->mount_utab_event_source,
                m->utab_inotify_fd,
                EPOLLIN,
                mount_dispatch_io,
                m);

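Points to note:

  • fileno(m->proc_self_mountinfo) returns the file descriptor of the open file "/proc/self/mountinfo".
  • EPOLLPRI is an epoll event flag meaning the descriptor has urgent (priority) data to read. For "/proc/self/mountinfo" the kernel reports it when the mount table changes.
  • mount_dispatch_io is the callback invoked when an event is received.
  • m->utab_inotify_fd is an inotify descriptor that corresponds to "/run/mount".
  • EPOLLIN is an epoll event flag meaning data is available to read.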

sd_event_add_io actually calls source_io_register to perform the registration, which is implemented on top of epoll.

// Implementation of source_io_register (excerpt)
......
if (s->io.registered)
        r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_MOD, s->io.fd, &ev);
else
        r = epoll_ctl(s->event->epoll_fd, EPOLL_CTL_ADD, s->io.fd, &ev);
......

If the event source is already registered, it is updated with EPOLL_CTL_MOD; otherwise it is added to the epoll set with EPOLL_CTL_ADD.

In short, this chain of calls registers watches on "/proc/self/mountinfo" and "/run/mount". When either one signals an event, the callback mount_dispatch_io is invoked.
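
The registration step can be sketched in isolation with plain epoll. This is a minimal sketch, not sd-event's code: struct io_watch, io_watch_register and watch_mountinfo are hypothetical names introduced here, and error handling is reduced to returning -1.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical, trimmed-down stand-in for an sd-event I/O source. */
struct io_watch {
    int fd;
    uint32_t events;
    bool registered;
};

/* Add the fd to the epoll set on first use and modify it afterwards,
 * mirroring the EPOLL_CTL_ADD / EPOLL_CTL_MOD branch shown above. */
static int io_watch_register(int epoll_fd, struct io_watch *w, uint32_t events) {
    struct epoll_event ev;
    int r;

    assert(w);

    memset(&ev, 0, sizeof ev);
    ev.events = events;
    ev.data.ptr = w; /* lets the event loop find the watch again */

    if (w->registered)
        r = epoll_ctl(epoll_fd, EPOLL_CTL_MOD, w->fd, &ev);
    else
        r = epoll_ctl(epoll_fd, EPOLL_CTL_ADD, w->fd, &ev);
    if (r < 0)
        return -1;

    w->registered = true;
    w->events = events;
    return 0;
}

/* Open /proc/self/mountinfo and watch it for EPOLLPRI.
 * Returns the opened fd, or -1 on error. */
static int watch_mountinfo(int epoll_fd, struct io_watch *w) {
    assert(w);

    int fd = open("/proc/self/mountinfo", O_RDONLY);
    if (fd < 0)
        return -1;

    w->fd = fd;
    w->events = 0;
    w->registered = false;

    if (io_watch_register(epoll_fd, w, EPOLLPRI) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

After watch_mountinfo succeeds, an epoll_wait on the same epoll descriptor would return an event carrying EPOLLPRI whenever a mount or unmount changes the table, which is the point where a callback such as mount_dispatch_io would run.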

The mount_dispatch_io callback

When a new mount appears in "/proc/self/mountinfo", the flow that adds the mount unit and the dependency needed for unmounting is:

--> src/core/mount.c mount_load_proc_self_mountinfo
--> src/core/mount.c mount_setup_unit
--> src/core/unit.c unit_new
--> src/core/mount.c should_umount
--> src/core/unit.c unit_add_dependency_by_name -- UNIT_CONFLICTS -- SPECIAL_UMOUNT_TARGET

When a device state change is detected, the call flow that triggers the unmount is:

--> src/core/device.c device_found_node
--> src/core/device.c device_update_found_by_name
--> src/core/device.c device_update_found_one
--> src/core/device.c device_set_state
--> src/core/unit.c unit_notify
--> src/core/job.c job_finish_and_invalidate -- JOB_STOP -- UNIT_CONFLICTED_BY
--> src/core/job.c job_finish_and_invalidate

Analysis of the fix patch

  • Unmount criterion with the patch: collect the what of every mount that is no longer mounted, and separately the what of every just_mounted or just_changed mount. When the unmount flow is triggered, these two sets decide which mounts need to be unmounted.
  • Criterion before the patch: every mount that is no longer mounted and has a non-empty what triggers the unmount flow.

if (!mount->is_mounted) {

+ /* A mount point is gone */
+
mount->from_proc_self_mountinfo = false;

switch (mount->state) {
@@ -1710,13 +1715,17 @@ static int mount_dispatch_io(sd_event_source *source, int fd, uint32_t revents,
break;
}

- if (mount->parameters_proc_self_mountinfo.what)
- (void) device_found_node(m, mount->parameters_proc_self_mountinfo.what, false, DEVICE_FOUND_MOUNT, true);
+ /* Remember that this device might just have disappeared */
+ if (mount->parameters_proc_self_mountinfo.what) {

+ if (set_ensure_allocated(&gone, &string_hash_ops) < 0 ||
+ set_put(gone, mount->parameters_proc_self_mountinfo.what) < 0)
+ log_oom(); /* we don't care too much about OOM here... */
+ }

} else if (mount->just_mounted || mount->just_changed) {

- /* New or changed mount entry */
+ /* A mount point was added or changed */

switch (mount->state) {

@@ -1741,12 +1750,27 @@ static int mount_dispatch_io(sd_event_source *source, int fd, uint32_t revents,
mount_set_state(mount, mount->state);
break;
}
+
+ if (mount->parameters_proc_self_mountinfo.what) {
+
+ if (set_ensure_allocated(&around, &string_hash_ops) < 0 ||
+ set_put(around, mount->parameters_proc_self_mountinfo.what) < 0)
+ log_oom();
+ }
}

Triggering the unmount flow for devices that are not in around:

+        SET_FOREACH(what, gone, i) {
+ if (set_contains(around, what))
+ continue;
+
+ /* Let the device units know that the device is no longer mounted */
+ (void) device_found_node(m, what, false, DEVICE_FOUND_MOUNT, true);
+ }

Note: what is in fact the device.
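
The selection rule of the patched code amounts to a set difference: a device is reported as no longer mounted only if it appears in gone and not in around. The following is a minimal sketch of that rule using plain string arrays instead of systemd's Set type; contains and devices_no_longer_mounted are hypothetical helpers, and the device paths in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Linear membership test; a stand-in for set_contains() on a string set. */
static bool contains(const char *const *set, size_t n, const char *what) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(set[i], what) == 0)
            return true;
    return false;
}

/* Copy into `out` every device from `gone` that is not also in `around`.
 * Returns the number of entries written (at most out_size). In the patch,
 * each such device is passed to device_found_node(). */
static size_t devices_no_longer_mounted(const char *const *gone, size_t n_gone,
                                        const char *const *around, size_t n_around,
                                        const char **out, size_t out_size) {
    size_t n = 0;

    assert(out);

    for (size_t i = 0; i < n_gone && n < out_size; i++) {
        if (contains(around, n_around, gone[i]))
            continue; /* still backs another mount point: leave it alone */
        out[n++] = gone[i];
    }
    return n;
}
```

With gone = {"/dev/sda1", "/dev/nonexistent"} and around = {"/dev/sda1"}, only /dev/nonexistent is returned, so a device that still backs another mount point is not reported as unmounted.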