How Socket Error Codes Depend on Runtime and Operating System

This post is the first part of a blog post series that covers different technical challenges that we had to resolve during the migration of the Rider backend process from Mono to .NET Core. By sharing our experiences, we hope to help out those who are in the same boat.
There’s too much to share in one post, so we will make this into a series of posts. In this series:

Let’s dive in!

Sockets and error codes

Rider consists of several processes that send messages to each other via sockets. To ensure the reliability of the whole application, it’s important to properly handle all the socket errors. In our codebase, we had the following code which was adopted from Mono Debugger Libs and helps us communicate with debugger processes:

In the case of a failed connection because of a “ConnectionRefused” error, we are retrying the connection attempt. It works fine with .NET Framework and Mono. However, once we migrated to .NET Core, this method no longer correctly detects the “connection refused” situation on Linux and macOS. If we open the SocketException documentation, we will learn that this class has three different properties with error codes:

  • SocketError SocketErrorCode: Gets the error code that is associated with this exception.
  • int ErrorCode: Gets the error code that is associated with this exception.
  • int NativeErrorCode: Gets the Win32 error code associated with this exception.

What’s the difference between these properties? Should we expect different values on different runtimes or different operating systems? Which one should we use in production? Why do we have problems with ShouldRetryConnection on .NET Core? Let’s figure it all out!

Digging into the problem

Let’s start with the following program, which prints error code property values for SocketError.ConnectionRefused:

If we run it on Windows, we will get the same value on .NET Framework, Mono, and .NET Core:

SocketErrorCode ErrorCode NativeErrorCode
.NET Framework 10061 10061 10061
Mono 10061 10061 10061
.NET Core 10061 10061 10061

10061 corresponds to the code of the connection refused socket error code in Windows (also known as WSAECONNREFUSED).
Now let’s run the same program on Linux:

SocketErrorCode ErrorCode NativeErrorCode
Mono 10061 10061 10061
.NET Core 10061 111 111

As you can see, Mono returns Windows-compatible error codes. The situation with .NET Core is different: it returns a Windows-compatible value for SocketErrorCode (10061) and a Linux-like value for ErrorCode and NativeErrorCode (111).
Finally, let’s check macOS:

SocketErrorCode ErrorCode NativeErrorCode
Mono 10061 10061 10061
.NET Core 10061 61 61

Here, Mono is completely Windows-compatible again, but .NET Core returns 61 for ErrorCode and NativeErrorCode.
In the IBM Knowledge Center, we can find a few more values for the connection refused error code from the Unix world (also known as ECONNREFUSED):

  • AIX: 79
  • HP-UX: 239
  • Solaris: 146

For a better understanding of what’s going on, let’s check out the source code of all the properties.

SocketErrorCode

SocketException.SocketErrorCode returns a value from the SocketError enum. The numerical values of the enum elements are the same on all the runtimes (see its implementation in .NET Framework, .NET Core 3.1.3, and Mono 6.8.0.105):

These values correspond to the Windows Sockets Error Codes.

NativeErrorCode

In .NET Framework and Mono, SocketErrorCode and NativeErrorCode always have the same values:

In .NET Core, the native code is calculated in the constructor (see SocketException.cs#L20):

The Windows implementation of GetNativeErrorForSocketError is trivial (see SocketException.Windows.cs):

The Unix implementation is more complicated (see SocketException.Unix.cs):

TryGetNativeErrorForSocketError should convert SocketError to the native Unix error code.
Unfortunately, there exists no unequivocal mapping between Windows and Unix error codes. As such, the .NET team decided to create a Dictionary that maps error codes in the best possible way (see SocketErrorPal.Unix.cs):

Once we have an instance of Interop.Error, we call interopErr.Info().RawErrno. The implementation of RawErrno can be found in Interop.Errors.cs:

Here we are jumping to the native function SystemNative_ConvertErrorPalToPlatform that maps Error to the native integer code that is defined in errno.h. You can get all the values using the errno util. Here is a typical output on Linux:

Note that errno may be not available by default in your Linux distro. For example, on Debian, you should call sudo apt-get install moreutils to get this utility.
Here is a typical output on macOS:

Hooray! We’ve finished our fascinating journey into the internals of socket error codes. Now you know where .NET is getting the native error code for each SocketException from!

ErrorCode

The ErrorCode property is the most boring one, as it always returns NativeErrorCode.
.NET Framework, Mono 6.8.0.105:

In .NET Core 3.1.3:

Writing cross-platform socket error handling

Circling back to the original method we started this post with, we rewrote ShouldRetryConnection as follows:

There was a lot of work involved in tracking down the error code to check against, but in the end, our code is much more readable now. Adding to that, this method is now also completely cross-platform, and works correctly on any runtime.

Overview of the native error codes

In some situations, you may want to have a table with native error codes on different operating systems. We can get these values with the following code snippet:

We executed this program on Windows, Linux, and macOS. Here are the aggregated results:

SocketError Windows Linux macOS
Success 0 0 0
OperationAborted 995 125 89
IOPending 997 115 36
Interrupted 10004 4 4
AccessDenied 10013 13 13
Fault 10014 14 14
InvalidArgument 10022 22 22
TooManyOpenSockets 10024 23 23
WouldBlock 10035 11 35
InProgress 10036 115 36
AlreadyInProgress 10037 114 37
NotSocket 10038 88 38
DestinationAddressRequired 10039 89 39
MessageSize 10040 90 40
ProtocolType 10041 91 41
ProtocolOption 10042 92 42
ProtocolNotSupported 10043 93 43
SocketNotSupported 10044 94 44
OperationNotSupported 10045 95 45
ProtocolFamilyNotSupported 10046 96 46
AddressFamilyNotSupported 10047 97 47
AddressAlreadyInUse 10048 98 48
AddressNotAvailable 10049 99 49
NetworkDown 10050 100 50
NetworkUnreachable 10051 101 51
NetworkReset 10052 102 52
ConnectionAborted 10053 103 53
ConnectionReset 10054 104 54
NoBufferSpaceAvailable 10055 105 55
IsConnected 10056 106 56
NotConnected 10057 107 57
Shutdown 10058 32 32
TimedOut 10060 110 60
ConnectionRefused 10061 111 61
HostDown 10064 112 64
HostUnreachable 10065 113 65
ProcessLimit 10067 10067 10067
SystemNotReady 10091 10091 10091
VersionNotSupported 10092 10092 10092
NotInitialized 10093 10093 10093
Disconnecting 10101 108 58
TypeNotFound 10109 10109 10109
HostNotFound 11001 -131073 -131073
TryAgain 11002 11 35
NoRecovery 11003 11003 11003
NoData 11004 61 96
SocketError -1 -1 -1

This table may be useful if you work with native socket error codes.

Summary

From this investigation, we’ve learned the following:

  • SocketException.SocketErrorCode returns a value from the SocketError enum. The numerical values of the enum elements always correspond to the Windows socket error codes.
  • SocketException.ErrorCode always returns SocketException.NativeErrorCode.
  • SocketException.NativeErrorCode on .NET Framework and Mono always corresponds to the Windows error codes (even if you are using Mono on Unix). On .NET Core, SocketException.NativeErrorCode equals the corresponding native error code from the current operating system.

SocketErrors-Blog
A few practical recommendations:

  • If you want to write portable code, always use SocketException.SocketErrorCode and compare it with the values of SocketError. Never use raw numerical error codes.
  • If you want to get the native error code on .NET Core (e.g., for passing to another native program), use SocketException.NativeErrorCode. Remember that different Unix-based operating systems (e.g., Linux, macOS, Solaris) have different native code sets. You can get the exact values of the native error codes by using the errno command.

References

This entry was posted in Dev Team blog and tagged , , , , , . Bookmark the permalink.

Leave a Reply

Your email address will not be published. Required fields are marked *