Once a Developer, Always a Developer
Alina Komissarova, Coordinator of JetBrains Educational Projects who is based in Novosibirsk, Russia, interviewed her fellow Siberian Tagir Valeev, Java Tech Lead in IntelliJ IDEA. They talked about what life and work are like for someone who feels the never-ceasing drive to develop, share his knowledge, and learn Japanese.
Tagir Valeev, Java Tech Lead in IntelliJ IDEA at JetBrains
Let’s start at the beginning, or maybe even a little earlier than that. What did you do before JetBrains?
I got my first job in 2004 – that’s when I began to program for money and not just for a good cause. I was finishing my Master’s degree at the Physics department of Novosibirsk State University, where I majored in Computer Science and Physics. The program was supposed to prepare us for jobs in the automation of colliders, subatomic particle detectors, and other tech like that. Some of my fellow students are now doing that sort of thing at CERN. But I was there to learn programming, not physics. A friend I played bridge with told me about a bioinformatics firm that was looking for C++ coders, and I got hired. The firm had ties with the A. P. Ershov Institute of Informatics Systems, so later I decided to go to graduate school there and get a PhD.
What was your thesis about?
I was building a gene regulation model. A human has about 20 thousand genes, roughly speaking, which are all the same in every cell and never change throughout the person’s lifetime. Though, obviously, an eye is different from a liver, a sick cell is different from a healthy one, and so on. The point is that our genes work in different ways; some are activated and some are deactivated. Gene regulation is this activation and deactivation process. Genes have signatures in their promoter region, such as binding sites. You could say they’re like file signatures. So, if a certain set of proteins all gather in one cell and they match together to produce that signature, the gene is activated and starts working. This isn’t discrete, like “yes or no”, though. A gene might produce protein very slowly, or vice versa, it might crank out a whole lot of protein. These processes are different in each cell, and sometimes they depend on what is happening to the individual.
Research in this area had progressed quickly shortly before I went to university. Other researchers had already come up with some experimental platforms. For example, there was a device that you could put a piece of tissue into and find out which genes were expressed in it and which weren’t. And they had just recently managed to finally sequence the entire human genome! You could look at every single gene and see what’s in its promoter, and then try to figure out how that’s related to gene activation. They built higher-level models based on that and tried to optimize them.
In my PhD work I used a genetic algorithm to solve a genetic problem, which is not as intuitive or expected as it might sound. I wrote a program in C++ to process experimental data, like gene sequences and the presence of certain regions in promoters. Using that data, the program would try to predict the gene situation in a cell, such as what proteins and what transcription factors are playing a role in the cell. This is what my thesis was about, which I defended in 2006.
Two years, that’s quick!
I guess so – I tried to avoid procrastinating. But it wasn’t only up to me. There are some graduate schools out there where no one can complete a PhD in less than ten years, no matter how smart or hard-working they are. At my university, many did it in two or three years.
Our firm then developed a whole platform where you could load data, filter the gene lists, and sort and categorize them. It incorporated my analysis and a bunch of other algorithms, and it ended up being a pretty cool system.
Then 2008 and 2009 happened, and the firm ran into some hard times. I knew a few people at the bioinformatics lab of the Institute of Computer Science and Design Technology in Novosibirsk. They worked on a project called BioUML, an integrated environment for collaborative analysis of biomedical data. They took me in, and that’s where I picked up Java, which I’ve been actively using since 2009.
How did you become interested in static analysis? You worked in this area even before you joined JetBrains, didn’t you?
It happened more or less randomly. I’ve always looked at code as a subject of research – not just as a tool that does the job it’s designed to do. At some point I started paying more attention to the quality of the code we were writing. I learned about static analyzers, like FindBugs for Java. Back in 2013 it was still relevant, at least until Java 8 came out. So I installed FindBugs and analyzed our project, which was already quite large with 10 thousand classes. Boy, did I find some embarrassing things there! Talk about a facepalm moment. We were using Eclipse back then, which had a pretty weak built-in static analysis – it had some required compilation warnings but not much else. I was very surprised to learn that code errors could be detected just like that.
I got to work fixing the issues FindBugs identified in our code. It was going great until I realized the tool wasn’t always very accurate. Sometimes it failed to see some problems, and sometimes it reported things that weren’t problems. I found the project on SourceForge and gave them my feedback about what I thought could be improved. I reported some bugs and contributed a few patches at first. Later I started adding serious new features that FindBugs was missing, such as integer ranges analysis. My detector could find some interesting code smells. Let’s say you have an if statement like if (x > 0), and then inside it you check whether x == -1. The detector will tell you that this comparison is redundant because we already know x is positive. This was a real improvement.
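The kind of code such a detector targets can be sketched as follows. This is only an illustration of the pattern described above, not code from FindBugs or from the detector itself; the class and method names are hypothetical.

```java
// Hypothetical example: names are made up for illustration only.
final class RangeCheckDemo {
    static String describe(int x) {
        if (x > 0) {
            // Inside this branch, x is known to lie in [1, Integer.MAX_VALUE].
            if (x == -1) {
                // Never reached: -1 is outside the known range, so the
                // comparison above is always false. The compiler accepts it,
                // but a range-aware analyzer can report it.
                return "minus one";
            }
            return "positive";
        }
        return "non-positive";
    }

    public static void main(String[] args) {
        System.out.println(describe(5));
        System.out.println(describe(-1));
    }
}
```

Calling describe(-1) never enters the outer branch at all, which is exactly why the inner check can never succeed.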
At some point I realized FindBugs was a dead end. It was around that time that its founder Bill Pugh left the project and other big contributors stopped being active. But more importantly, the tool had an extremely cumbersome architecture with lots of global variables, which is why it couldn’t support multithreading.
Did you create your own fork?
It wasn’t even a fork. It occurred to me that I could write my own static analysis tool. You see, FindBugs analyzes bytecode, not source code. And it’s very low-level, too – it works with individual bytecode instructions. So if you want to create any non-trivial checks, it’s pure hell, meaning you have to jump through a lot of hoops. It doesn’t have a global model of the code, just a sequence of instructions. I tried to create a model similar to that in FindBugs, but I quickly realized it would take ages.
When I looked at other promising engines, I found Fernflower, which JetBrains maintains, and Procyon. I went with Procyon, mainly because JetBrains was not big on making Fernflower available as a standalone tool. I don’t know about now, but back then it didn’t even have Maven artifacts. I would have had to build it myself, on top of other things.
Procyon looked good to me; I could use it as a library. Under the hood it builds a high-level model way above bytecode, which lets you see the bigger picture – the nesting of loops, conditions, and so on. It is also pretty good at restoring generic types from bytecode after they are erased.
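For readers unfamiliar with erasure, here is a small sketch of why restoring generic types is possible at all. It does not show how Procyon works internally, which the interview does not describe. It only demonstrates, using standard reflection, that generic arguments vanish from runtime objects while declarations still carry their generic signatures in the class file. The class and field names are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example: names are made up for illustration only.
final class ErasureDemo {
    // Field used only so that there is a generic declaration to inspect.
    static List<String> names = new ArrayList<>();

    // Erasure: lists with different type arguments share one runtime class.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    // The declared type argument of the field is still stored in class-file
    // metadata and can be read back through reflection.
    static Type declaredElementType() {
        try {
            Field field = ErasureDemo.class.getDeclaredField("names");
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            return type.getActualTypeArguments()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());
        System.out.println(declaredElementType());
    }
}
```

A bytecode-level tool has access to the same declaration metadata, though recovering the generic types of local variables and expressions takes additional inference.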
I enjoyed writing my own static analyzer at that level of abstraction. I called it HuntBugs and I spent quite a bit of time improving it. In the end it had more than one hundred different checks. I tried to implement most of the functionality of FindBugs and add some new things of my own. FindBugs had about 400 inspections (diagnostics). I think I covered about 30% of them before I ran out of steam.
How’s HuntBugs doing now?
It’s been abandoned, but it played a big role in me going to work for JetBrains. Before JetBrains opened its new office in Novosibirsk, the company announced its JetBrains Night event and they said there would be interviews. I realized it was a chance for me to continue doing what I enjoyed, but for money.
I also pulled a trick I thought was clever. I took the source code of IntelliJ IDEA Community Edition – its biggest JAR file, lib/idea.jar, was at least 100 MB in size – and I analyzed it with HuntBugs. It had a lot of results to show, some garbage here and there, but some of the findings were interesting. It managed to uncover some bizarre pieces of code in the files. So I wrote an article on Habr about it, basically to say, “Look, I have this cool analyzer called HuntBugs, I checked IntelliJ IDEA with it, and here’s what I found.” I mean, IntelliJ IDEA Community Edition had its own static analysis that was supposed to be strong, and its contributors must have been vigilant about the quality of their own code. But thanks to HuntBugs I still managed to find a few things to nit-pick and improve.
When you wrote that article, did you already know JetBrains was going to open an office in Novosibirsk?
That was before I went to my interview, but after I had scheduled it. So yeah, my goal when I wrote the article was to put my name out there.
You also had a high rating on Stack Overflow, as well as other published Habr articles and a Twitter account. Did that, too, play a role in you getting the job?
I think so. And then there was also that earlier story about streams (the Java Stream API) that Java 8 introduced. I really got into streams for a while. At first I was like, “Wow, I can do so much cool stuff with streams so easily!” – only to realize that some things couldn’t be done out of the box. I began to write my own library I called StreamEx. I put it on GitHub and published an article on Habr describing a better way to work with streams.
However, I wanted to promote my library and expand beyond Habr’s all-Russian audience. I went to Stack Overflow to see what kind of questions people were asking about streams. First of all, those questions could help me decide what new features to add to StreamEx. Second, I could post answers like, “You can’t do what you want using just the vanilla Stream API. Well, you could, but it would be ugly. But there’s this library – free and open-source, by the way – that can help you solve your problem nicely and easily.” According to the Stack Overflow rules that was OK, unless the original poster specified they didn’t want to use any third-party solutions.
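As an illustration of the "possible, but ugly" category, consider computing the differences between adjacent elements of a list. This example is not taken from the interview or from any particular Stack Overflow question. The plain Stream API has no operation for adjacent pairs, so a common workaround is to stream over indices instead of elements. The class and method names below are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical example: names are made up for illustration only.
final class AdjacentDiffDemo {
    // Streams over index positions because the vanilla API cannot pair
    // each element with its neighbour directly. Assumes a list with
    // fast random access, such as ArrayList.
    static List<Integer> adjacentDiffs(List<Integer> values) {
        return IntStream.range(0, values.size() - 1)
                .mapToObj(i -> values.get(i + 1) - values.get(i))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(adjacentDiffs(Arrays.asList(1, 4, 9, 16)));
    }
}
```

The index-based version works, but it ties the pipeline to random-access lists and hides the intent. StreamEx addresses cases like this with dedicated operations (for adjacent pairs, its pairMap method), which is not shown here to keep the sketch free of third-party dependencies.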
I kept answering questions and soon enough I got addicted. It grew into something bigger than just maintaining StreamEx. Stack Overflow does gamification just right, with ratings, badges, charts, and so on. So I spent about a year doing that, mostly to do with streams but also some other related topics. That’s how I met Oracle people like Brian Goetz and Stuart Marks, who would also pop in and answer questions. I even found a bug in the API and reported it by posting a question on Stack Overflow.
Did they respond?
Yep, Brian did. They opened a ticket in the OpenJDK tracker and fixed the bug in one of the following Java 8 updates. But I also learned that Stack Overflow is not the best way to report bugs or get through to the right people. The mailing list was the right place to do that. I had no clue about the Java development process. For example, I didn’t know that lots of things happen out in the open, such as in mailing lists. I signed up, joined the discussions, and then gradually began pitching in. After a while I became a contributor, and then an author and committer in OpenJDK.
During my interview with JetBrains, I did mention StreamEx, but we mostly talked about HuntBugs. Because of those two open-source projects, I didn’t have to do a test assignment. The interviewers just looked at my code and asked me how I chose those solutions and why I wrote the code the way I did. You can tell a lot about a developer’s skill by just looking at their code.
Does your internet presence keep influencing your life? Are people inviting you to join their events or go to work for their companies?
Yes, and I attend conferences, too, so I do get lots of invitations to talk at events and give lectures. I often have to say “No”, though, because I just don’t have the time or the resources. I don’t like to give the same talk multiple times, and preparing new ones takes a lot of time and energy.
Even though you decline a lot of those invitations, you still give quite a few talks, and you teach, too! Do you like working with an audience?
I love sharing knowledge. When I know something that others might not, I feel great telling them about it. But the talks and lectures are very time-consuming, since I have to prepare everything thoroughly and rehearse multiple times. Sometimes I feel like giving up. But when the talk goes well, it’s exhilarating. People are eager to learn, they listen and ask questions, and you know you’re being useful. In the end, they’ve learned something and they didn’t look at their watches (most of the time). That tells me that maybe I can make a talk interesting.
I’ve listened to some of your talks and I was fascinated. Writing is about sharing knowledge, too, but you seem to prefer giving talks. Why is that?
I guess the formats are a little different. I gave my first talk at the Joker conference in 2015. That’s when I became interested in learning how Java compiles code, and I saw some incredible optimizations there. I began to write an article for Habr about that, but it was coming out kind of uninspiring. You see, it was a bit of a detective story: we try this one approach but it doesn’t work, but then we try this other approach, and bam, it works… but we don’t know why! It occurred to me that if I made this content into a talk, it would be a lot more exciting than just a piece of text. I could do a show, insert some dramatic pauses and things, you know. So I did, and I liked how it came out.
JetBrains offers relocation to its other offices in Russia and other countries. What’s keeping you in Novosibirsk?
I don’t see a big benefit in moving. I think the advantages and the drawbacks cancel each other out. Take climate, for example. You never know about the future – especially considering global warming, Novosibirsk might well end up being a better place to be.
Moving to another country presents its challenges, too, like language barriers, social adaptation for your children, not having friends to spend time with, being away from family… lots of basic things that you have to do all over again, from scratch. As a stay-at-home kind of person, I don’t relish abandoning everything and sailing to distant shores.
Besides that, the nature of our work lets us work from anywhere. Wherever you’re based, you look at the same monitor screen and write the same code. Novosibirsk is cheaper than Moscow or St. Petersburg, too. I can afford a better quality of life here than I would there. This is especially true for things like getting your children into a private kindergarten or using private healthcare providers.
And then there’s patriotism. I’ve lived in Siberia all my life and I feel I can make a difference here. For example, I could help set up interesting conferences, maybe make the region more attractive – turn it into a place that people want to move to rather than move away from. Our planet is huge, so why should we all live in the same few places? Why not spread out a little, you know?
Speaking of that, you work in a very distributed team: Novosibirsk is 4 hours ahead of Moscow and St. Petersburg and 5 to 6 hours ahead of Munich. Has the move to remote affected how you and your teammates work together?
Before lockdown, there were three of us Java team members working in the same room in the Novosibirsk office. We could talk about all the work matters face to face. Sometimes we would solve coding problems by sitting down together and pair-programming. I miss those times when we could strike up a conversation with a colleague across the room at any moment. I feel we’re more isolated now and less aware of what’s happening. But then again, even before the lockdown many of my teammates were based in other places like St. Petersburg and Munich, so the change is not that huge. And of course, we have weekly team meetings and one-to-one calls, plus I try to go to our virtual “standup” meetings almost every day.
There’s also this new meeting format called “virtual coffee”. It’s a set time every day when our whole team can get together online and chat about non-work-related things. People talk about life and just whatever is important to them, and this has helped us better connect with some colleagues who are based in other offices.
JetBrains has something called “20% projects”, where you can work on something outside of your job description. Are you taking advantage of this?
I’m impulsive by nature. When I get an idea to develop a new feature, I drop everything and start coding it. So scheduling 20% of my time every week to work on something specific doesn’t sound like it would fit my work style. This is also why I’ve never participated in hackathons – they just aren’t my cup of tea. If I have a bright idea, I’d rather get working on it right away. And if I don’t have one, it won’t come to me during a hackathon anyway.
I do devote some of my time to things not directly related to my job, such as helping shape the future of Java. I’m part of the Amber Expert Group, which screens all new Java features, like records, pattern matching, switch expressions, and so on. I’m also fairly active in discussions, editing technical specifications drafts, and sometimes committing patches. I’ve never kept track of how much of my work time is spent on this, but it’s probably less than 20%.
You joined JetBrains as a senior developer and have since grown into a tech lead. How did that affect your tasks and workload?
I’m being invited to a lot more cross-functional meetings now, as well as the mandatory team leads’ meetings. I spend more time discussing cross-team collaboration, and I do more mentoring. This is all good, but management is not for me. I consider myself a developer and, naturally, am striving to keep programming as much as possible.
How does your team discuss work matters and make decisions? Do you take your own approach?
Some discussions don’t have to involve the whole team. When one of us takes on a problem, they are usually knowledgeable enough to decide how to approach it and how to solve it. The tech lead doesn’t pass decisions down to other team members, either. Each one of us can speak their mind about how to approach a problem they’re tasked with, and their opinion is often the decisive factor.
Some decisions, though, are about more than just coding, for example when they affect the user experience. Let’s say someone asks us to create a new inspection or warning. First, we have to decide whether it’s something we want to do at all. If the requested feature is of limited importance, or maybe we’re not sure we can eliminate all the false positives and make the inspection accurate enough, we might not go ahead with it. Or it might be related to a third-party library that we don’t want to support globally. We make sure to discuss these kinds of things as a team and reach a decision together.
We wrote a guideline a couple of years ago that describes the rules for processing such decisions. It helps us answer questions like, “Do we want to add this new warning or not?” and, if we do, “Should we make a totally new inspection or incorporate the warning into an existing inspection?”. If we created a new inspection every time, soon enough our users would end up being inundated and disoriented by so many different inspections. On the other hand, if we added a new warning to an existing inspection, users would not be able to switch it off separately from all the other warnings that the inspection can generate. We discuss questions like this in our internal Slack chat and reach decisions as a team.
My point of view isn’t always right. In general, the more experience someone has in the area, the better qualified they are and the more weight we give to their opinion, regardless of their job title or role in the team. The more experienced colleagues are more likely to have dealt with that specific part of the codebase. But if a new team member digs into a subsystem and works on it for, say, a month or so, soon enough they become the resident expert in it, even if they’re less experienced overall, and we’ll give the most weight to their ideas and suggestions.
What do you do for fun? I’ve heard you’re fluent in Japanese.
I’ve been focusing on my family, so I’ve had less and less time for anything you might call a “hobby.” The things I do for fun now usually revolve around my work, like attending conferences and helping develop Java. When I spend the weekend preparing a conference talk and my son asks me, “Daddy, are you working?” I say, “Nope, big man, I’m having fun.” Because you’re not supposed to work on weekends, you know?
A long time ago I used to have hobbies that had nothing to do with programming. I did study Japanese for six years, and I even passed Nihongo Nōryoku Shiken, the Japanese-Language Proficiency Test. I passed level 2 of the exam, with level 1 being the highest. This was back in 2006.
Why Japanese?
I was a fan of anime. Well, I was a fan of Pokémon and then I got hooked on anime. I would first watch an episode dubbed and then watch the original, just trying to get a basic idea of what they were saying.
I’ve always had an interest in, and some aptitude for, languages. I could memorize words well. So that’s how I fell in love with Japanese. It is beautifully structured, with interesting grammar and very simple phonetics. Much nicer than English with its impossible vowel sounds that I can never get right!
But the Japanese writing system is more difficult, isn’t it? It has three different scripts and so many characters.
That’s never bothered me. You simply memorize each word visually. People in all parts of the world now speak in emoji, don’t they? This trend came from Japan – “emoji” is a Japanese word. The Japanese think in pictures – the Kanji words are nothing but pictures. You memorize their meanings, and that’s it. And if you come across a word you haven’t seen before, you’ll likely know one of the two characters it consists of, and you can use that to deduce the meaning. There are many characters, but not too many.
I learned about 1000, maybe 1100 kanji, plus the two syllabic writing systems, each of which has about 50 characters. So 1100 or 1200 characters in total is a pretty realistic number. But English isn’t all that different, really. I think of English words as hieroglyphs of sorts, too. When you’re learning English, you see an English word and you have no idea how to say it correctly. So you have to learn three different things: how the word is spelled, how it’s pronounced, and what it means. Japanese is the same. And then, as you become more experienced, you start seeing patterns and you can sometimes guess the meaning of Japanese words you haven’t learned yet.
Even though I like Japanese, I’ve had to put it on the back burner. But lately we’ve been running a huge localization project for IntelliJ IDEA, and I’m now using its Japanese version. It does slow down my work, but it’s helping me refresh my Japanese skills.
So you’re actually coding in the Japanese version of IntelliJ IDEA?
That’s right, it’s fully localized and I’m using it in my daily work. Sometimes I look for something in the user interface and I can’t find it. The worst case scenario is when I need to find an inspection that I know the English name of. So there’s this list of 2000 inspections, all in Japanese of course, and I need to choose the correct one. A tough problem! But other than that, it’s working just fine. I get a nice nostalgic feeling looking back on all the Japanese I used to know.
It’s mostly practice for my own edification, but once in a while I notice translation errors and I let our localization team know about those. I’ve found a couple dozen things to improve, so it’s been a helpful activity.
Alina Komissarova, Coordinator of JetBrains Educational Projects in Novosibirsk