do1****@yande***** wrote:
> > There was a very long history of
> > battle between pathname based access control (e.g. AppArmor) and inode based
> > access control (e.g. SELinux). Since pathname based access control had
> > an advantage which cannot be achieved using inode based access control, TOMOYO
> > and AppArmor were able to join the mainline. (The advantage is not the ease of
> > use; see http://sourceforge.jp/projects/tomoyo/docs/lfj2008-bof.pdf and
> > http://sourceforge.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf for examples.)
>
> Great slides, thanks for them.
>
> In lfj2008-bof.pdf you say, that to protect /etc/shadow being read if linked to /tmp/shadow
> pathname based access control needs "to restrict pathname changes", but label based
> don't need to care about it. So you saying yourself that it is possible to fully control
> access to /etc/shadow if we also control renaming, linking, mounting.

I didn't say I can perfectly control access to /etc/shadow . I said we need to
care about not only the readability/writability/executability of a file but
also the location of the file, because whether the system works as expected
depends on whether resources are available at the expected locations. I said
name based access control can care about the location of resources better than
inode based access control can.

My opinion is that inode based access control and name based access control
play complementary roles, and therefore both should be used together. LSM
stacking is now under discussion, and we will be able to run both access
controls in parallel by the end of this year.

Name based access control can control access to a file only if the file to
protect has uniquely identifiable pathnames. A temporary file created in /tmp/
may have a random name like /tmp/w4gZ6h , and multiple applications may create
such temporary files.
But name based access control cannot distinguish which temporary file should be
accessible from which application, because the name has no meaning in this
case. Only inode based access control, which associates the creator
application's information with the file, can correctly distinguish such
temporary files. Not all files have uniquely identifiable names. Also, the same
pathname in different namespaces may refer to different resources. Inode based
access control sometimes handles such cases better than name based access
control.

> And all these things are controllable. So why you now saying perfectly protecting
> using pathname based access control fails and I should understand and accept it.

Only partially controllable. Not always perfectly controllable. As I said
above, name based access control can protect resources only when the resources
have limited and meaningful names. That's the reason you can't perfectly
protect using pathname based access control.

> I don't understand, if you saying in pdf yourself that it is possible. It just requires
> more work. Plus, you then prove that this more work is in fact what is may be
> required and is a good thing. Which I agree. (For example, binding /etc/ to /tmp
> - we need protection against this, this is pathname based feature, and this also
> solves problem of /etc/shadow being pathname manipulated.) So why it fail or not
> compatible with read-only directories? It looks compatible and your words support it.
>

A mount/directory which is read-only in one namespace can be read/write in
another namespace. Pathname based access control can't handle this because it
has no means to distinguish namespaces.

> > This is impossible because of how pathnames are managed in Linux.
> > Three data structures are involved here: "struct inode", "struct dentry" and
> > "struct vfsmount". A pathname is converted to a "struct vfsmount" and
> > "struct dentry" pair.
"struct inode" can be determined via > > "struct dentry"->d_inode and parent directory can be determined via > > "struct dentry"->d_parent . But a "struct vfsmount" and "struct dentry" pair > > which is needed for calculating a canonical pathname cannot be determined from > > "struct inode". Therefore, we cannot calculate a pathname from "struct inode". > > I know. But this is 'traverse' operation, so you possible have parts of > pathname (becasue it is supposed to traverse elements of pathname) not just > inode numbers. (How is Smack LSM enforce r or x on directory - so probably > LSM have some hooks into traverse mechanisms. I don't sure if it have parts > of pathname at that time, though, if not, then it is not possible of course.) No partial pathname available for traverse operation. SELinux and SMACK do not calculate pathnames for checking permissions. The hook which SELinux and SMACK use for checking traverse permission receives only "struct inode". int security_inode_permission(struct inode *inode, int mask); Since "struct dentry" and "struct vfsmount" are not passed to security_inode_permission() hook, pathname based access control (e.g. TOMOYO, AppArmor, AKARI, CaitSith) cannot use pathname when checking permission for traverse operation. Of course, you can join both LSM mailing list and FS-devel mailing list and persuade the both maintainers to pass "struct dentry" and "struct vfsmount"; good luck. > In any case you maybe able to calculate directory realpath from dentry (just > cd to '..' until / is meet, I think that's how realpath is already works.) Not true. Pathname based access control's realpath is calculated from a "struct dentry" and "struct vfsmount" pair. If we have only "struct dentry", we can calculate partial pathname only up to mount point which the "struct dentry" belongs to; in order words, we cannot calculate till / is met if "struct dentry" does not belong to / partition. 
> > On the contrary, we have hooks for checking permissions (which TOMOYO,
> > AppArmor, AKARI, CaitSith uses) which are called after a pathname was converted
> > to a "struct vfsmount" and "struct dentry" pair. We can calculate a pathname
> > from these hooks but we cannot determine whether we have traversed a directory
> > inode which has path.ino=1234567 path.major=8 path.minor=2 attributes or not.
>
> But Smack is LSM and it can restrict traverse operation. So there should be
> some hooks.

There is a hook for checking permission for the traverse operation. But we
cannot calculate a pathname from that hook. SMACK can work because SMACK does
not use pathnames for checking permissions, while your proposal can't be
implemented because it needs to calculate pathnames from that hook.

> > The concept of canonical pathname does not work.
>
> This example does not say anything about canonical pathname. It try to forbid
> traverse by min/maj/inode.
>

There are multiple routes to reach the directory containing the "yourbackup"
file. Try the operations below. (Note the $ prompt, which means a non-root
user.)

(1) $ mkdir -m 777 -p /tmp/dir1/dir2/dir3/
(2) $ echo hello > /tmp/dir1/dir2/dir3/file
(3) $ mkdir -m 777 -p /tmp/dir0/
(4) $ su - root -c "mount --bind /tmp/dir1/dir2/dir3/ /tmp/dir0/"
(5) $ cd /tmp/dir1/dir2/dir3/
(6) $ chmod 000 /tmp/dir1/
(7) $ cat /tmp/dir1/dir2/dir3/file
    cat: /tmp/dir1/dir2/dir3/file: Permission denied
(8) $ cat file
    hello
(9) $ chmod 777 /tmp/dir1/
(10) $ chmod 000 /tmp/dir1/dir2/
(11) $ cat /tmp/dir1/dir2/dir3/file
     cat: /tmp/dir1/dir2/dir3/file: Permission denied
(12) $ cat file
     hello
(13) $ chmod 777 /tmp/dir1/dir2/
(14) $ chmod 000 /tmp/dir1/dir2/dir3/
(15) $ cat /tmp/dir1/dir2/dir3/file
     cat: /tmp/dir1/dir2/dir3/file: Permission denied
(16) $ cat file
     cat: file: Permission denied
(17) $ chmod 777 /tmp/dir1/dir2/dir3
(18) $ chmod 000 /tmp/dir1/dir2/
(19) $ cat /tmp/dir1/dir2/dir3/file
     cat: /tmp/dir1/dir2/dir3/file: Permission denied
(20) $ cat /tmp/dir0/file
     hello

Suppose your min/maj/inode traversal checking idea is implemented and
min/maj/inode is set to the attributes of /tmp/dir1/dir2/ at the time of
loading rules. You will fail to block (20), because /tmp/dir0/file can reach
/tmp/dir1/dir2/dir3/file even though /tmp/dir1/dir2/ is not within /tmp/dir0/ .

> > Trying to restrict based on grandparent directory's inode and/or its ascendant
> > inodes above does not work because only last component's inode and its parent
> > directory's inode are guaranteed to be checked, for a process might request
> > pathnames relative to current directory (e.g. unlink("yourbackup") rather than
> > unlink("/home/backup/year/month/day/yourbackup") when its current directory is
> > /home/backup/year/month/day/ ).
>
> I understand that, but this does not mean this approach does not work at all.
> For example, if I forbid for some min/maj/inode (directory) access altogether, then
> user will be not able to chdir to any underlying pathname and then unlink("backupfile").
> So this will work.
> (Plus, we still be possible able to determine realpath of parent directory.)
>

And comparing (8), (12) and (16), you can see that only (16) failed. In other
words, only the permissions of /tmp/dir1/dir2/dir3/ and
/tmp/dir1/dir2/dir3/file are checked if the current directory is already
/tmp/dir1/dir2/dir3/ and the requested access is to ./file . Trying to reject
access using the attributes of /tmp/dir1/dir2/ or /tmp/dir1/ does not work.

You might think we can still block if we forbid changing the current directory
to /tmp/dir1/dir2/dir3/ and forbid the bind mount operation. Yes, if your
system can work even if you unconditionally forbid them. But we can't do that;
security_inode_permission() cannot tell whether it is being called for a chdir
operation or for some other operation. If you unconditionally deny the traverse
operation at security_inode_permission(), yourbackup becomes unreachable for
everyone (i.e. even for applications which should be able to access
yourbackup).

> > People can access protected data using relative pathnames. This means that,
> > a directory with "path.ino=1234567 path.major=8 path.minor=2" may not be
> > traversed when accessing a file which is located as a descendant of the
> > directory.
>
> People will not be able to chdir behind that directory, so it works.
>
> > Also, /home/ or /home/backup/year/month/day/ might be bind mounted to
> > somewhere else.
> >
> > If you use pathnames in your rules, please understand and accept that
> > the rules are not compatible with read-only mounts/directories.
>
> I think they are compatible, or at least step in that direction. We don't need to
> achieve perfect security suddenly, but can go step by step and see if this helps.
>
> Best regards,
>

I hope you have now understood why your min/maj/inode idea does not work.
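
P.S. For anyone who wants to replay the comparison of (12), (19) and (20)
without a root shell, below is a toy Python model. It is a sketch under
assumptions, not the kernel's lookup algorithm: the Inode class, the
checked_inodes() function and the inode numbers 100-106 are all hypothetical.
It records which directory inodes get a search permission check while a
pathname is resolved from a given starting directory.

```python
class Inode:
    """Toy inode: a number plus, for directories, a name -> Inode table."""

    def __init__(self, ino, children=None):
        self.ino = ino
        self.children = children or {}


def checked_inodes(start, path, binds):
    """Return the inode numbers of the directories searched while resolving path.

    start is the directory where resolution begins (the root for absolute
    pathnames, the current directory for relative ones). binds maps the inode
    number of a mount point to the directory bind mounted on top of it.
    """
    node = start
    checked = []
    for name in path.split("/"):
        if not name:
            continue  # skip the empty piece produced by a leading "/"
        checked.append(node.ino)            # search permission on this directory
        node = node.children[name]
        node = binds.get(node.ino, node)    # crossing a mount point
    return checked


# Hypothetical inode numbers for the layout used in steps (1) to (4).
file_ = Inode(106)
dir3 = Inode(105, {"file": file_})
dir2 = Inode(104, {"dir3": dir3})
dir1 = Inode(103, {"dir2": dir2})
dir0 = Inode(102)
tmp = Inode(101, {"dir1": dir1, "dir0": dir0})
root = Inode(100, {"tmp": tmp})
binds = {dir0.ino: dir3}  # mount --bind /tmp/dir1/dir2/dir3/ /tmp/dir0/

print(checked_inodes(root, "/tmp/dir1/dir2/dir3/file", binds))  # prints [100, 101, 103, 104, 105]
print(checked_inodes(dir3, "file", binds))                      # prints [105]
print(checked_inodes(root, "/tmp/dir0/file", binds))            # prints [100, 101, 105]
```

Only the absolute pathname through /tmp/dir1/dir2/ ever touches inode 104 (the
stand-in for /tmp/dir1/dir2/). The relative lookup from the current directory
and the lookup through the bind mount both reach the file without it, so a rule
keyed on that inode has nothing to match.
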