Kotlin Developer Advocate Eugene Petrenko talked with Maria Antropova, Head of the JetBrains Marketing Research department, about her work, her team, their current tasks, and future plans.
Your team has been around for quite a while. How did you start out and what kinds of research do you do?
It was back in 2012, when I came to the ReSharper team as an analyst. As more analysts joined me over time, we formed a team as part of the Marketing department. We started with product surveys, market research, and sales analysis—small-scale tasks that addressed our colleagues’ ad hoc requests at the time.
Today, the Research team does more than traditional marketing research, which mostly consists of surveys and analysis of open-source information. We also conduct pricing research, design user personas, analyze the popularity of various technologies, do UX research, and build statistical models. Another area of focus for our team is serving as a kind of middleman between our colleagues and the internal statistical systems at JetBrains. Upon request, we export and analyze data from a variety of internal and external sources.
Most of our tasks are associated with surveys. In our annual Developer Ecosystem Survey, we study the developer ecosystem as a whole, and we explore its parts in projects such as the Python Developers Survey, which is conducted together with the Python Software Foundation. By the way, right now we are conducting our third Developer Ecosystem Survey and invite everyone to participate.
How did the Developer Ecosystem Survey come about?
The idea of running a survey like this was already being considered a long time ago. Let me check quickly… the first related request in YouTrack dates back to November 4, 2013. In the beginning, it was just a concept. I was seeing other reports about the developer community and felt that it would be cool to explore the ecosystem, to see what’s going on there, what makes it tick, and how it changes with time. Businesses require global, representative, and easily accessible data, but there just wasn’t much available. However, at that point, there were no articulated requests from other teams for such a large-scale study, and, within the Research team, we could not yet tell exactly how we might be able to apply any potential results. This is why we did not launch this project back then.
After a while, our understanding of its potential started to take shape and we began putting the survey into production. It quickly became our team’s flagship ongoing project. What had changed? First of all, at some point, as we added more and more surveys to the pipeline, it made more sense to optimize things by running one large-scale annual survey.
We combine the collected survey data with our statistical model, which uses macroeconomic data to predict the number of developers in different countries and the distribution of programming languages across the world. Together, they let us estimate the popularity and viability of various technologies.
Second, we felt that we had gained enough expertise and fine-tuned our survey distribution and result validation processes, enough to tackle an annual survey of this scale. Importantly, other teams have found our work useful and practical, which really encourages us and keeps us motivated.
Today, we conduct about ten surveys every month (including internal, event, and partner surveys), but the Developer Ecosystem Survey remains our biggest source of useful data and insight.
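To make the idea of combining survey data with a model of developer counts more concrete, here is a minimal sketch. It is not the team's actual model: the function name, the country labels, and all numbers are made up for illustration, and it assumes the simplest possible approach, where each country's survey share is weighted by its modelled developer population.

```python
# Hypothetical illustration: none of these names or numbers come from JetBrains.

def weighted_language_share(survey_share, developers):
    """Estimate a global share from per-country survey shares.

    survey_share: {country: share of respondents using the language, 0..1}
    developers:   {country: modelled number of developers in that country}
    Each country is weighted by its modelled developer count rather than
    by the number of survey responses collected there.
    """
    countries = survey_share.keys() & developers.keys()
    total = sum(developers[c] for c in countries)
    if total == 0:
        raise ValueError("no overlapping countries with a developer estimate")
    return sum(survey_share[c] * developers[c] for c in countries) / total


# Made-up inputs: country "A" is over-represented in the survey sample,
# but country "B" has three times as many developers.
share = {"A": 0.50, "B": 0.20}
devs = {"A": 1_000_000, "B": 3_000_000}
estimate = weighted_language_share(share, devs)
```

With these made-up inputs the weighted estimate is pulled toward country "B", which is the point of weighting: a country with many respondents but few developers should not dominate the global figure.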
How difficult is it to run the Developer Ecosystem Survey? How much time does it take?
To give you an idea of the scale, we estimate it’s going to take about nine months to go from designing the Developer Ecosystem Survey 2019 to publishing the infographics. Enough time to have a baby! The reason it takes so long is that there are just an awful lot of questions. Every year we have to double-check the logic of the survey, make sure that all the questions are still relevant, and add new ones, if necessary. To do this, we coordinate the content with every product team and the Marketing department.
Collecting data also takes time: we must clearly define the distribution channels for the survey and gather the required number of responses.
After the first Developer Ecosystem Survey, some of our colleagues at JetBrains raised questions about how representative our results were. After all, much of the audience we reach is part of the JetBrains community, which may differ from the developer community as a whole—just as it would with any other company. Keeping this in mind, we distribute the Developer Ecosystem Survey in two ways: through our company’s channels and advertising channels like Twitter, Facebook, Google AdWords, and others. We are aware that there is still room for bias, as people are more likely to take a survey conducted by a company they know. However, having analyzed the results, we found out that they weren’t that different across the distribution channels (except for the questions dealing with JetBrains products), and were pretty consistent with the findings from third-party studies.
You’ve mentioned that you received the task for this survey through your issue tracker. I’d like to know more about your team’s process. Who can approach you with a task?
We mainly respond to requests from other teams, and in doing so we’ve been trying to follow the principles of internal consulting. This means that we not only run the studies requested by the ‘customer’ and report on the results, but we also work with them to find ways to apply these results practically and benefit from them. Sometimes it works, sometimes it doesn’t, but the approach seems to have its merits. In fact, anyone at JetBrains, regardless of their position, can approach us with a task, propose an idea, or discuss some hypotheses with us. If for some reason we cannot give them a good answer, we tell them so.
Recently, we’ve been initiating more and more studies ourselves. We do this at our own risk. Without an external customer, we do the work ahead of time, just in case someone needs it; but on the flip side, there’s no guarantee that anyone will ever need it. Then again, that’s how almost all of our projects started—with us looking for colleagues in other teams who might be interested in one topic or another that we proposed. When the benefits are tangible, the project comes to life. This strategy has been working for us, and it’s a lot like how product development works at JetBrains, too.
Absolutely. If you don’t know for sure if an idea will take off, you try your best and see what happens. Still, your area is different from software development. What challenges do you face, for example when interacting with colleagues?
Conducting a research study involves a lot of complexities associated with the design and implementation—the kinds they write about in all the textbooks and talk about in classes. All kinds of bias tend to come up, and in quantitative studies, there’s also the issue of data quality.
As for our colleagues, we always have to justify the validity of what we are doing for them. For example, if a customer is unhappy with the results of a survey, various reasons may be at play. There may or may not be some professional fault on our part. In such scenarios, it is very important to exchange opinions and figure out whether there’s a mismatch between the customer’s expectations and the results we’ve obtained, or whether it was us who missed something. This is more typical of technology-specific studies, where we don’t analyze markets but deal with more specialized product-related tasks.
JetBrains makes over 20 products, all of them targeting different technologies and different markets. No matter how hard we try, we cannot be experts in all the areas of our company’s work, even though everyone on our team has experience in software development and solving applied problems in programming. This is why we are now trying to engage our colleagues from product teams as experts at all stages of research, and this is helping a lot. At some point, teams may start hiring new analysts to help with their specific internal product-related tasks. This is already happening to a degree.
How is your team’s work organized on large-scale projects? Do you find yourself needing a manager, or a ‘communicator’?
‘Manager’ would be an overstatement. We have no managers who would only manage how others work on a project. We have a lead analyst assigned to every project, and they take care of all the administrative tasks. I manage the team as a whole, but I also have research tasks of my own. Besides, I always review the findings of my teammates.
We also have a Technical Lead on our team who takes care of the entire infrastructure and is responsible for the quality of the data we use. Then there are two developers. One specializes in databases and helps us with SQL and ETL. The other is engaged in full-stack development and helps us visualize the reports and prepare the infographics. We also have analysts who do market and quality research. And then there are the technical analysts, who are actually data scientists; we are just so used to calling them technical analysts that the name has stuck. Speaking of data scientists, we are currently looking for one and have a job posting on our website. We are hoping to find a cool person to join our team and help us build statistical models.
By the way, I am pretty happy with the latest trends in the job market. Five years ago, when we were looking for our first technical analyst, we didn’t get a lot of responses and there were only a few suitable candidates. Most of the applicants worked with SPSS, a package for statistical computing with a graphical user interface. Unlike R, where you write code, it’s a system designed for sociology and psychology studies. You can write scripts there, too, but they are neither as flexible nor as practical.
Can you tell me more about the technologies you use?
R is our primary language—all the automation and data processing scripts are in R. Surveys are almost entirely automated. After we receive the data, it is processed with an R script (which we write before launching the survey), and the report is automatically generated in a Google Doc, which contains all the charts. All that’s left to do then is to draw conclusions manually.
In terms of other technologies, we run builds in TeamCity and keep our code on GitHub.
Is it true that a minor data-processing error can be fixed within minutes?
It is now. We can change a few lines of code, and a new Google Doc will be ready in a matter of minutes.
In fact, data processing and report generation have become some of the fastest stages in the whole process.
Not so long ago, we decided to completely reconsider how we work on surveys. Now we write data processing code before launching a survey. This allows us to process the data instantly, test the survey, and catch logic bugs as they appear. Logic defines the branching of a survey based on the respondent’s answers. When someone takes a poll, they are smoothly redirected through its different branches.
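The kind of logic bug mentioned above can be illustrated with a small sketch. It is only an assumption about how branching could be modelled, not the team's R code or SurveyGizmo's format: the survey is a hypothetical dictionary that maps each question and answer to the next question, `walk` follows one respondent's path, and `unreachable_questions` reports questions that no path can ever reach.

```python
from collections import deque

# Hypothetical survey: each question maps an answer to the next question id.
# None ends the survey; "*" is a catch-all branch for any other answer.
SURVEY = {
    "uses_python": {"yes": "python_version", "no": "primary_language"},
    "python_version": {"*": "primary_language"},
    "primary_language": {"*": None},
    "orphan_question": {"*": None},  # deliberately unreachable
}


def next_question(survey, question, answer):
    """Return the id of the question that follows the given answer."""
    branches = survey[question]
    if answer in branches:
        return branches[answer]
    if "*" in branches:
        return branches["*"]
    raise KeyError(f"no branch for answer {answer!r} to {question!r}")


def walk(survey, start, answers):
    """Follow one respondent through the survey and return the path taken."""
    path = []
    current = start
    while current is not None:
        if current in path:
            raise ValueError(f"survey logic loops back to {current!r}")
        path.append(current)
        current = next_question(survey, current, answers.get(current, ""))
    return path


def unreachable_questions(survey, start):
    """Breadth-first search over all branches; return questions never visited."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for target in survey[current].values():
            if target is not None and target not in seen:
                seen.add(target)
                queue.append(target)
    return set(survey) - seen


path_no = walk(SURVEY, "uses_python", {"uses_python": "no"})
orphans = unreachable_questions(SURVEY, "uses_python")
```

Running a check like `unreachable_questions` before launch would flag `orphan_question` in this made-up survey, which is the sort of problem that is much cheaper to catch before any responses are collected.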
So, we write code, test what needs to be tested, and then perform manual testing. Next, we launch an internal pilot to try out the survey on a small group of JetBrains employees. Then we do the real pilot, using the first 100 responses from the actual survey audience to make sure everything is fine.
When we need help from designers, we try to involve them as early as possible. Many teams contribute to making a survey and preparing the infographics, including email marketing and Internet marketing specialists, web developers, designers, translators, product marketing managers, and others.
How do you use SurveyGizmo? Does it show results in real time?
SurveyGizmo offers real-time reports, which are good for rough estimates. But the data is noisy (with bots, duplicates, and fakes), so you need to export it and prepare a report. To develop our surveys in SurveyGizmo, we use the UI. You can’t make a whole survey through the API alone; the logic still has to be written manually. Naturally, we have a library of reusable questions. But even though we have perfected the wording over the years, every survey is subject to proofreading. We translate the Developer Ecosystem Survey into eight languages to counter any bias in favor of English speakers. We use the SurveyGizmo API to purge data in order to stay compliant with GDPR. Survey results are exported through the API as well.
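As a rough illustration of why exported data needs cleaning before reporting, the sketch below filters out likely bots and duplicates. Everything in it is an assumption made for the example: the field names `respondent_key` and `duration_seconds` are hypothetical rather than SurveyGizmo's export schema, and the two heuristics (a minimum completion time and keeping the first response per respondent) are only one simple way to approach the problem.

```python
def clean_responses(responses, min_seconds=60):
    """Drop likely bots and duplicates from exported survey rows.

    Assumed heuristics (not from the interview): a response completed in
    under `min_seconds` is treated as a bot, and only the first response
    per respondent key is kept.
    """
    seen = set()
    kept = []
    for row in responses:
        if row["duration_seconds"] < min_seconds:
            continue  # too fast to be a genuine response
        key = row["respondent_key"]
        if key in seen:
            continue  # same respondent already counted
        seen.add(key)
        kept.append(row)
    return kept


# Made-up rows standing in for an exported data set.
RAW = [
    {"respondent_key": "a1", "duration_seconds": 540, "language": "Kotlin"},
    {"respondent_key": "a1", "duration_seconds": 530, "language": "Kotlin"},
    {"respondent_key": "b2", "duration_seconds": 12, "language": "Java"},
    {"respondent_key": "c3", "duration_seconds": 800, "language": "Python"},
]
CLEAN = clean_responses(RAW)
```

In this example the second "a1" row is dropped as a duplicate and the "b2" row as implausibly fast, leaving two responses for the report.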
What else has your team been up to?
A year ago, we started analyzing local markets. We’ve been exploring promoting our products in specific countries and regions of the world. Entering a new local market is a tough proposition: you need to understand what’s going on there, how it’s different, how the competition is doing, whether you’ll have to localize your products or if the default English version will do, and so on. We’ve compiled a list of countries for analysis using a statistical model that takes into account many different factors, and then coordinated the list with the Sales department. For each country, we conducted PESTLE analysis, investigated the IT sector, and analyzed our internal statistics.
Another interesting project we’re running is coming up with user personas. A user persona is a fictional representation of a group of people who use a company’s products, services, or communication channels in a similar way or pattern. Even though they are generalized fictional characters, personas usually have a face, a name, and some distinctive features of their biography. So far, we’ve interviewed PyCharm users and designed user personas for the PyCharm development team.
How does someone go about joining your team? Where can they apply?
If what I’ve just told you strikes a chord with someone, they can always send a CV to our HR department. We are ready to consider promising candidates, even if there isn’t an open position. The ideal Data Scientist for us has programming experience and codes in R or Python. Knowledge of SPSS, MATLAB, or Statistica is not required, as these systems are not part of our technology stack. Experience in creating statistical models, however, is a must. We are looking for people who are fluent in English and have good writing skills, which can be a decisive factor for certain roles. For example, right now we are looking for a UX researcher with an excellent command of English to communicate with our English-speaking users.
What’s next for the Research team?
As the company is growing rapidly, we are receiving more and more tasks, and we have several open positions which we need to fill with skilled new colleagues. We will continue to make our processes more efficient by automating everything that can be automated. We are also planning to adapt and adopt several new types of research studies and will be paying more attention to how their findings are applied.
Thanks a lot, Maria!