Ticket #42561

MSVCRT.DLL implementation of POSIX dup2() function does not conform to POSIX.1

Open Date: 2021-06-23 07:34 Last Update: 2021-06-27 04:38

Reporter:
Owner:
(None)
Type:
Status:
Open
Component:
Milestone:
(None)
Priority:
5 - Medium
Severity:
5 - Medium
Resolution:
None
File:
None

Details

I've been aware of this non-conformity for several years, yet it continues to trip me up, on occasion. Consider the code fragment:

  int fd = open( "foo.txt", O_WRONLY | O_CREAT | O_EXCL, S_IREAD | S_IWRITE );
  if( dup2( fd, STDOUT_FILENO ) != STDOUT_FILENO )
    perror( "open" );

On a POSIX.1 conforming platform, (with added support for the BSD standard S_IREAD and S_IWRITE macros), assuming that "foo.txt" is successfully opened, and bound to STDOUT_FILENO by the dup2() call, the condition dup2( fd, STDOUT_FILENO ) == STDOUT_FILENO will evaluate as TRUE, and the perror() call will not be invoked. However, on Windows, the same code, (assuming that errno is zero, immediately prior to its execution), will likely result in output (from perror(), which is invoked) similar to:
open: Success.

The reason for this disparity, given a function prototype of int dup2( int fd1, int fd2 ), is that, whereas POSIX.1 stipulates that the return value, following a successful invocation, shall be equal to fd2, the Microsoft implementation always returns zero, on success, regardless of the value of fd2.
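
Code which must behave identically under both conventions can test for failure, rather than for equality with fd2. The following is a minimal sketch, not taken from the report itself; rebind_fd() is a hypothetical helper name, and the only assumption is that dup2() returns a negative value on failure, as both POSIX.1 and Microsoft document:

```c
#include <stdio.h>
#include <unistd.h>   /* on MinGW, this is expected to pull in <io.h> */

/* Hypothetical helper: rebind `target` to the file referred to by `fd`.
 * Returns 0 on success, -1 on failure.  Testing for a negative result,
 * rather than comparing against `target`, gives the same answer under
 * both return-value conventions, (fd2 on POSIX.1, zero on MSVCRT). */
int rebind_fd( int fd, int target )
{
  if( dup2( fd, target ) < 0 )
  {
    perror( "dup2" );
    return -1;
  }
  return 0;
}
```

With such a helper, the condition in the fragment above would become rebind_fd( fd, STDOUT_FILENO ) != 0, and no diagnostic is printed after a successful call, on either platform.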

To be fair to Microsoft, although their documentation does describe dup2() as a POSIX function (in name only), it indicates that it is a deprecated alias for Microsoft's non-standard _dup2(), for which the documentation does indicate the non-conformity w.r.t. the return value stipulated by POSIX.1. (The stated reason for the deprecation is "because [it doesn't] follow the Standard C rules for implementation-specific names", a rationale which would seem not to be endorsed by ISO/IEC, since dup2() is ratified as an international standard name, within ISO/IEC 9945, as it is in IEEE 1003.1-2017). However, this catches me out every time I have occasion to use dup2(), and I don't think it is unreasonable to expect POSIX.1 conformity, when I call any function by its POSIX.1 name; I really would like to see this fixed in the MinGW implementation.

Ticket History (3/4 Histories)

2021-06-23 07:34 Updated by: keith
  • New Ticket "MSVCRT.DLL implementation of POSIX dup2() function does not conform to POSIX.1" created
2021-06-26 02:47 Updated by: keith
Comment

A reasonably simple work-around would be to provide an in-line implementation of dup2():

  #define dup2 __mingw_posix_dup2
  __CRT_ALIAS __cdecl __MINGW_NOTHROW int dup2 (int __fd1, int __fd2)
  { return ((__fd1 = _dup2( __fd1, __fd2 )) == 0) ? __fd2 : __fd1; }
which would override the declaration in <io.h>, (provided that it appears after <io.h> has been included).
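
The return-value mapping performed by this work-around can be isolated as a pure function, which makes it easy to see that it is harmless on a platform which already conforms. This is an illustrative sketch only; normalize_dup2_result() is a hypothetical name, and is not part of mingwrt:

```c
/* Hypothetical helper: map a raw dup2() or _dup2() result onto the
 * value which POSIX.1 requires.  `raw` is whatever the underlying call
 * returned; `fd2` is the descriptor passed as its second argument. */
int normalize_dup2_result( int raw, int fd2 )
{
  /* Zero means success under the Microsoft convention, so report fd2.
   * Any other value, (fd2 on a conforming platform, or -1 on failure),
   * is already what POSIX.1 specifies, and passes through unchanged. */
  return (raw == 0) ? fd2 : raw;
}
```

The only case in which a conforming dup2() itself returns zero is when fd2 is zero, and then the mapping still yields zero, so the result is correct under either convention.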

There are several options, (not exhaustive), for provision of such a work-around:

  1. Do nothing; leave it to the user's discretion, if and when to implement such a work-around, in their own code. That may be advantageous to me, as mingwrt maintainer, but is hardly a convenient solution for end users.
  2. Add the work-around within <io.h>, as an unconditional replacement for the existing dup2() function declaration. This would ensure POSIX.1 conformity, in any use of the function, when referred to by its POSIX.1 name, but may come as a surprise to any user who expects the original Microsoft behaviour, in spite of using the POSIX.1 function name; (this may not be a significant concern, since such usage cases really should be using Microsoft's preferred, ugly, non-standard, non-portable alternative name, _dup2(), rather than the POSIX.1 name in a non-conforming context).
  3. Add the work-around within <io.h>, as for option 2 above, but make replacement of the existing dup2() declaration conditional on some feature test — either a new one, created specifically for this purpose, or on #if __UNISTD_H_SOURCED__, (which is already in scope, when <io.h> is included by <unistd.h>). I don't like this, as a potential solution, because, on the one hand, it would require definition of yet another non-standard feature test macro — and we already have more than enough of those — while on the other hand, it would make the work-around sensitive to header inclusion order; any inclusion of <io.h>, (either directly, or possibly indirectly through another header, of which there are several), before including <unistd.h>, would (possibly unexpectedly) disable the work-around.
  4. Add the work-around within <unistd.h>, after this has included <io.h> itself. This has the advantage that any code which includes <unistd.h>, (as might be expected of any POSIX.1 conformant user of the dup2() function), will automatically activate the work-around, (regardless of any prior inclusion of <io.h>). If we also make it conditional, on absence of a prior definition of dup2, (as a macro):
      #ifndef dup2
      #define dup2 __mingw_posix_dup2
      __CRT_ALIAS __cdecl __MINGW_NOTHROW int dup2 (int __fd1, int __fd2)
      { return ((__fd1 = _dup2( __fd1, __fd2 )) == 0) ? __fd2 : __fd1; }
      #endif
    then the user has the option to disable the work-around, by defining:
      #define dup2 dup2
    before including <unistd.h>, (or any other header which might do so).

Of these options, the last is my preferred choice, with the second as a possible alternative; (note that, in the case of the second option, the existing declaration of dup2() could simply be deleted from <io.h>, and the definition of dup2(), as a macro, would become unnecessary).
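
The opt-out mechanism of the last option relies on the fact that a macro defined as its own name still counts as defined, for the purposes of #ifndef, while leaving every use of the name unchanged. A self-contained sketch of that mechanism follows; all of the demo_ names are hypothetical stand-ins, simplified to a single argument, and none of them exist in mingwrt:

```c
/* Stand-in for _dup2(): returns zero on success, -1 on failure. */
int demo_raw_dup2( int fd2 ) { return (fd2 >= 0) ? 0 : -1; }

/* Stand-in for the existing <io.h> dup2(): same return convention. */
int demo_dup2( int fd2 ) { return demo_raw_dup2( fd2 ); }

/* To opt out, a user would place this before the block below:
 *
 *   #define demo_dup2 demo_dup2
 */

/* Stand-in for the proposed <unistd.h> addition. */
#ifndef demo_dup2
#define demo_dup2 demo_posix_dup2
static inline int demo_dup2( int fd2 )
{ return (demo_raw_dup2( fd2 ) == 0) ? fd2 : -1; }
#endif

/* Resolves to the wrapper by default, or to the original on opt-out. */
int call_demo_dup2( int fd2 ) { return demo_dup2( fd2 ); }
```

As written, call_demo_dup2( 5 ) goes through the wrapper, and returns 5; with the opt-out definition in place, the #ifndef block is skipped, and the same call reaches the original function, which returns zero.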

2021-06-26 15:44 Updated by: eliz
Comment

Reply To keith

... Of these options, the last is my preferred choice, with the second as a possible alternative; (note that, in the case of the second option, the existing declaration of dup2() could simply be deleted from <io.h>, and the definition of dup2(), as a macro, would become unnecessary).

I agree with your preference.

However, please note that AFAIK there's one more non-conformance between the MS dup2 and POSIX: POSIX mandates that "If the close operation fails to close fildes2, dup2() shall return -1 without changing the open file description to which fildes2 refers." The MS implementation doesn't do that: it leaves the original fildes2's file open, and changes fildes2 to refer to the same file as the fildes descriptor. If you want to fix that issue as well, a simple wrapper will no longer be sufficient.

2021-06-27 04:38 Updated by: keith
Comment

Reply To eliz

... please note that AFAIK there's one more non-conformance between the MS dup2 and POSIX: POSIX mandates that "If the close operation fails to close fildes2, dup2() shall return -1 without changing the open file description to which fildes2 refers." The MS implementation doesn't do that: it leaves the original fildes2's file open, and changes fildes2 to refer to the same file as the fildes descriptor. If you want to fix that issue as well, a simple wrapper will no longer be sufficient.

Thanks, Eli. I must be missing something, but I'm struggling to get my head around (even the concept of) the scenario which you describe. Do you have an example, or can you create one, to illustrate the effect, please?

FWIW, Microsoft's documentation makes no mention of any failure condition related to failure to close a pre-existing file descriptor association for fd2; it simply, and unequivocally, states that if any such association exists, it is closed, and that the return value is zero if the function succeeds, but if any (vaguely specified) error occurs, the return value will be -1. If the implementation behaves as Microsoft themselves document it, then failure to close fd2 should result in a return of -1, and thus matches the POSIX requirement. How could it be otherwise? If the function succeeds, with fd2 reassigned after failure to close the original association, what happens to the original file stream? Does the underlying OS file handle remain open, as an orphan? What happens to any _fileno() association within an associated FILE * structure?

Attachment File List

No attachments
