Podcast

PyCharm and WSL

Over the past few months, I’ve been monitoring a ticket closely. Over the course of two years, the ticket has accrued over 130 votes. It’s the one about WSL support in PyCharm and, by extension, the rest of the JetBrains IDEs. When I say it’s “the one”, it’s because this is probably the most famous WSL-related ticket in our tracker. So, the question is, why is this taking so long to implement?

The History of WSL Support

As things stand right now, both WSL and WSL2 are supported in PyCharm. The issue is not whether they are supported, but how. Today, PyCharm talks to WSL directly via wsl.exe. Initially, however, we used SSH and SFTP to run commands and transfer files, because that was the only way we could support WSL at the time.
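To make the direct approach concrete, here is a minimal sketch of how a command line for wsl.exe could be assembled. It is not PyCharm's actual implementation: the helper names `windows_to_wsl_path` and `build_wsl_command` are hypothetical, and the path translation assumes the default WSL setup in which Windows drives are mounted under /mnt.

```python
from pathlib import PureWindowsPath


def windows_to_wsl_path(windows_path: str, mount_root: str = "/mnt") -> str:
    """Translate C:\\dir\\file.py to /mnt/c/dir/file.py.

    Assumes the default WSL automount layout (drives under /mnt).
    """
    path = PureWindowsPath(windows_path)
    if not path.is_absolute() or len(path.drive) != 2:
        raise ValueError(f"expected an absolute path with a drive letter: {windows_path!r}")
    drive = path.drive[0].lower()
    # path.parts[0] is the anchor ("C:\\"); the rest are directory and file names.
    return "/".join([mount_root, drive, *path.parts[1:]])


def build_wsl_command(distro: str, interpreter: str, script: str, *args: str) -> list[str]:
    """Build an argument list that runs `script` with `interpreter` inside `distro`."""
    # "-d" selects the distribution; "--" passes the rest of the line to Linux as is.
    return ["wsl.exe", "-d", distro, "--", interpreter, windows_to_wsl_path(script), *args]


if __name__ == "__main__":
    print(build_wsl_command("Ubuntu", "/usr/bin/python3", r"C:\proj\main.py"))
```

The resulting list could be handed to `subprocess.run` on a Windows machine with WSL installed; the sketch itself only builds the arguments, so it runs anywhere.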

There were multiple reasons for starting with SSH. WSL showed tremendous promise for people who wanted to develop on open source technologies. However, we needed to make sure that we could adapt to changes in WSL. At the same time, we were dealing with technology that was not our own, and we needed to be careful not to build support that would later have to be redone.

However, the biggest problem stems from a limitation of the IntelliJ platform at the time. IntelliJ expects that it is working with a real file system, and in the case of remote machines, you don’t have a real file system.

This is why we keep a copy of the files on your local machine, which is then uploaded via SFTP. This means that whenever you make changes, there is a delay before you can run them.
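The copy-then-upload cycle can be sketched roughly as follows. This is an illustration under stated assumptions, not the IDE's deployment code: the function names are hypothetical, change detection is done by comparing modification times, and a second local directory stands in for the remote host so that the example runs without an SFTP server.

```python
import shutil
from pathlib import Path


def changed_files(local_root: Path, last_synced: dict[str, int]) -> list[Path]:
    """Return files whose modification time differs from the one recorded at the last sync."""
    changed = []
    for path in sorted(local_root.rglob("*")):
        if path.is_file():
            key = path.relative_to(local_root).as_posix()
            if last_synced.get(key) != path.stat().st_mtime_ns:
                changed.append(path)
    return changed


def sync(local_root: Path, remote_root: Path, last_synced: dict[str, int]) -> list[str]:
    """Copy changed files to remote_root, a local directory standing in for the remote host."""
    uploaded = []
    for path in changed_files(local_root, last_synced):
        key = path.relative_to(local_root).as_posix()
        destination = remote_root / key
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, destination)  # a real deployment would upload over SFTP here
        last_synced[key] = path.stat().st_mtime_ns
        uploaded.append(key)
    return uploaded
```

Every edit has to pass through `sync` before the remote side sees it, which is where the delay comes from.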

Taking a deeper look, however, we begin to see the core of the issue: we need a better way to support remote development. By remote, I mean any remote host. This means WSL, but it also includes any host on a remote machine, so that we would not have to build custom implementations for things like WSL from scratch. This is why we began working on a project called “Targets”.


The Targets API

This new system provides a layer of abstraction over all remote hosts, whether it is WSL, an AWS or GCP instance, or any other machine for that matter. Now, we use the term “remote” loosely here, because to us, a remote is anything that is not the file system or the operating system that PyCharm is running on.

This means that the way to support interpreters will also change fundamentally; it also means that there is a lot of refactoring involved.

Think of the API as a matrix. Not The Matrix, but a matrix. If you want to support a new remote, then you need to start filling out that matrix, and you need to provide answers to how the IDE will handle different scenarios. So, for example, if you wish to add direct support for Docker or WSL, you will need to fill out the entire matrix of behaviours that can be done from the IDE.
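One way to picture the matrix is as a table of targets against capabilities, where an empty cell is a scenario nobody has answered yet. The sketch below is purely illustrative: the capability names, the target names, and the filled-in cells are invented for the example and do not describe the actual state of the Targets API.

```python
# Hypothetical capability names, chosen for illustration only.
CAPABILITIES = ("run_process", "sync_files", "create_virtualenv")

# Hypothetical rows: each target lists the scenarios it has an answer for.
support_matrix = {
    "local": {"run_process": True, "sync_files": True, "create_virtualenv": True},
    "ssh": {"run_process": True, "sync_files": True},
    "wsl": {"run_process": True},
}


def missing_cells(matrix: dict[str, dict[str, bool]]) -> dict[str, list[str]]:
    """List, per target, the capabilities that have no answer yet."""
    gaps = {}
    for target, row in matrix.items():
        unanswered = [cap for cap in CAPABILITIES if cap not in row]
        if unanswered:
            gaps[target] = unanswered
    return gaps


if __name__ == "__main__":
    for target, gaps in missing_cells(support_matrix).items():
        print(f"{target}: still needs {', '.join(gaps)}")
```

Adding a new target means adding a row, and the target only counts as fully supported once `missing_cells` reports nothing for it.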

Through this approach, we can indeed pave a way for all future remote targets, but it means that the transition to this API will be gradual, as a lot of the current functionality will need to be re-written in order to take advantage of this.

This also means that when complete, cloud providers will have an easier way of adding all kinds of functionality, and editing should become as fluid as editing on the filesystem itself (or so we hope).
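As a rough sketch of what this could look like for a provider: the transcript below describes an interface for running processes and synchronizing files, with higher-level features such as virtualenv handling kept out of it. The `Target` protocol, its method names, and `RecordingTarget` are all hypothetical and only illustrate that split; they are not the real Targets API.

```python
from typing import Protocol


class Target(Protocol):
    """Hypothetical minimal contract that a target provider would implement."""

    def run_process(self, command: list[str]) -> int:
        """Run a command on the target and return its exit code."""
        ...

    def upload(self, local_path: str, remote_path: str) -> None:
        """Copy a file from the IDE's machine to the target."""
        ...


def create_virtualenv(target: Target, interpreter: str, env_dir: str) -> bool:
    """A feature written once against the contract, not against a specific target."""
    return target.run_process([interpreter, "-m", "venv", env_dir]) == 0


class RecordingTarget:
    """Stand-in target that records commands instead of running them."""

    def __init__(self) -> None:
        self.commands: list[list[str]] = []

    def run_process(self, command: list[str]) -> int:
        self.commands.append(command)
        return 0  # pretend the command succeeded

    def upload(self, local_path: str, remote_path: str) -> None:
        pass  # nothing to copy in this stand-in


if __name__ == "__main__":
    target = RecordingTarget()
    print(create_virtualenv(target, "python3", "/home/me/.venv"), target.commands)
```

In this arrangement, a new provider only supplies `run_process` and `upload`, and `create_virtualenv` works on it without changes.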

Progress Thus Far

Our plan is to implement the Targets API in 2021, although we’re still working through a few issues that arise from the implementation. It will start with some basic things such as Docker support and remote interpreters. As the year progresses, we hope to add further support for WSL and bring it on par with all other remote targets.

Transcript

Nafiul: [00:00:00] Hello, all you beautiful PyCharmers. This is Early Access PyCharm with your host Nafiul Islam. Today I sit down with three people behind our WSL support and ask them some tough questions because a lot of people really want better support for WSL on PyCharm. So let’s get into it.

Ilya: [00:00:26] Well, we started to support WSL as a remote interpreter via SSH because at the time it was the only way to support it.

Nafiul: [00:00:36] This is Ilya. He’s one of the people who works on the remote interpreter team, which supports WSL in PyCharm, along with Vladimir as well as Alex.

Ilya: [00:00:47] So the user had to run an OpenSSH server inside of WSL and connect to it as they would connect to any other remote server.
And I believe a couple of years ago, we switched to a new approach. And so users can now launch the WSL processes directly. Under the hood we run wsl.exe and provide the whole path to the Python interpreter and to the script and so on. This is how it works now.

Nafiul: [00:01:19] So Vladimir, can you just tell me how this all started?
Not the WSL part, but also about remote interpreters in general.

Vladimir: [00:01:30] So it started even before we all had joined JetBrains. The oldest commits I’ve seen were made in 2012, if I’m not mistaken. So, I believe that’s when it started.

Nafiul: [00:01:45] So is this something that came from the IntelliJ platform or was this something that was made by the PyCharm team itself?

Vladimir: [00:01:51] No. As far as I know, initially it was made especially for PyCharm, and just a few years ago it was moved to the whole platform.

Nafiul: [00:02:04] Okay. So something went out of PyCharm and became accepted in other IDEs. So that’s pretty cool. This is not something that usually happens here at JetBrains. Usually it’s IntelliJ that builds the platform. And the features just sort of end up in other IDEs.
So the question that I have is, when you’re using something like WSL, or say Apple comes up with a fancy new mechanism for virtualization (we don’t know if that’s ever going to happen), essentially what is preventing us from incorporating or providing native support for something like WSL from the get-go?

Ilya: [00:02:49] Well, for WSL, we have a couple of problems. The first one is that all IntelliJ products are initially configured to work with local files. Even if you have your project on some remote system, you still have to store your files locally, and the IntelliJ product will copy them to the remote server automatically.

Nafiul: [00:03:11] And how does the sync happen?

Ilya: [00:03:13] There is a special configuration called deployment, and IntelliJ monitors your files, and when files are changed, they are copied to the remote server. Or in some cases they are copied before you launch your script.

Nafiul: [00:03:28] So essentially you have to copy the whole file.
You’re not changing the files themselves on the server. Like you just do a complete upload. Is that how it works?

Ilya: [00:03:37] Yes. Some products do support very limited file editing on the remote servers. As far as I know, PhpStorm supports this: you can open one file and edit it, but the whole project should be stored on your local machine and you should use your locally installed version control and so on.

Nafiul: [00:04:00] I see. Okay. It makes sense, but explain this to me. You need to copy it back and forth, but one of the issues that we have with WSL, for example, is support for virtual environments, right? That does not seem to be limited by copying and pasting files that are being edited inside of the editor.
So what is kind of holding us back in terms of giving users that support on virtual machines or WSL or whatever?

Ilya: [00:04:31] It’s more like a historical problem. We had very different approaches for virtual environments and different interpreter types. But now we are trying to unify all these things together and want to finish this job.
We should have, like, an API which will give us the ability to create a virtual environment on any interpreter type, be it WSL or SSH or whatever.

Sasha: [00:05:01] Yes, actually Ilya said exactly what our plans are, as for now. There are quite a lot of differences between local execution and local file system actions on the one hand, and working with files and executing files on remote machines on the other.
So basically now we have two different implementations for almost each feature. Like we have some extension points that are implemented differently for local machines and SSH machines. So this, I think this holds us back for some features that we are not exposing to users for remote development, like creating virtualenvs.
But generally the plan is that we are going to provide an API that allows us to use one code base for each of the features we provide and let this feature run on the local machine as well as on SSH and even on Docker or some AWS instances and so on.

Nafiul: [00:06:12] So essentially what you’re saying is the reason we haven’t solved this problem is because we want to solve this problem, not just for WSL, but for problems like WSL in the future as well.
So that different kinds of machines, virtual, remote… whatever it is … can be supported with a minimum level of effort instead of having to build everything from scratch over and over again. Am I correct in understanding that?

Sasha: [00:06:40] Yeah, it is quite correct.

Nafiul: [00:06:43] So how difficult is this?

Sasha: [00:06:46] As we already have a lot of source code for the different types of targets that we have, like local machine, SSH, Docker.
We need to bring all this together and get a single code base for each of these features and hide the differences of these targets under the API implementation. So...

Nafiul: [00:07:11] what you’re telling me is you have to change a lot of existing code, make sure that that doesn’t break, unify all of that into a framework and then support all the stuff we already support.
And then you can have WSL.

Sasha: [00:07:29] I mean, then we will have some WSL features that we don’t have now, because now we have WSL support for project execution.

Nafiul: [00:07:39] Yes, absolutely. But essentially what I’m saying is a lot of the features that we have right now will probably need to be reimplemented in order for everything to work and that will probably need to be tested.
Is that what you’re telling me? Like the mother of all refactorings.

Sasha: [00:07:57] Yeah, something like that. We did a lot of refactorings, for example, for the SSH subsystem. I started it some time ago, I think three years ago. And then, Vladimir came to our company, joined…

Nafiul: [00:08:10] You basically made him do all the hard work. Is that what you’re saying?

Sasha: [00:08:13] Yes, he made the next iteration, actually, of the refactoring. So yeah. We’ve got a lot of refactoring tasks and because we face new problems and sometimes it requires complete, not complete, but a general rewrite of the code. Yeah.

Nafiul: [00:08:34] Okay. That seems like a lot of work. So the question that I have is, once this Targets API is done, does that mean whenever somebody comes out with a new cloud, with a new way of doing things, with a new API, say for IBM cloud or for XYZ cloud or whatever, it will be far easier for them also to implement functionality within PyCharm?

Vladimir: [00:09:01] Yes. I believe the whole idea of the Targets API is to generalize the infrastructure for running processes and for synchronizing files, apart from some high-level things like virtual environments, like Python interpreters and so on. So yes, we want to make a simple API that would allow various cloud companies like IBM cloud, like Amazon and so on and so on just to implement some interface about running some extra process, about synchronizing files between machines, and we’ll keep all the things about virtualenv and so on away from that API.

Nafiul: [00:09:50] I see, well, thank you very much, Vova, Ilya and Alexander. Thank you for answering some very tough questions and I hope to book you again soon.

Ilya: [00:09:59] Bye!

Nafiul: [00:10:00] And thank you for listening. If you want more of these podcasts, let us know on Twitter.
