Tutorials

Does Machine Learning in Go Have a Future?

Read this post in other languages:

This article was written by an external contributor.

Sooter Saalu

Sooter Saalu is a data professional with experience as a data analyst, data scientist, and data engineer. He has an educational background in clinical psychology and is committed to writing engaging data stories for technical and non-technical audiences.

GitHub

Go is an open-source programming language originally designed by Google to optimize the construction and use of system-level services, work easily on large codebases, and utilize multi-core networked machines. Initially introduced in 2009, Go, as a statically typed and compiled programming language, is heavily influenced by C with a focus on simplicity, safety, and concurrency.

Go has most notably been used to create large-scale applications such as Docker and Kubernetes. In addition, it’s widely used in companies such as Netflix, PayPal, and Uber due to its low latency, efficient cross-platform performance, and easy scalability. However, despite its many advantages, Go has not been typically used in machine learning (ML).

In this article, you’ll explore the challenges of actively using Go for ML and the ways in which Go can gain a foothold in the ML space over time.

Main challenges for using Go for machine learning

Go is a powerful and efficient programming language and is fast and performant enough for the CPU-intensive, high-computation calculations that AI solutions require. It’s faster than Python and offers various advantages, such as ease of use, efficiency, and concurrency, making it superior to several other languages used in ML in certain cases.

Despite Go’s potential for creating robust and scalable ML applications and even its superior performance compared to some of the competition, it remains an overlooked option for ML. Go’s low adoption in ML can be mainly attributed to some significant challenges it faces, which older programming languages in the ML space have already resolved. These challenges include a lack of high-level libraries, no native bindings for CUDA, and less well-developed statistics and visualization capabilities.

Lack of high-level libraries

As a relatively new language, Go has a much smaller set of tools and libraries compared to other languages that have been around for decades and have well-established ecosystems and libraries for ML. As a result, Go has fewer high-quality libraries and tools available for ML tasks.

This means that developers who want to use Go for ML have to spend more time building custom solutions or integrating with existing libraries and frameworks that were not designed specifically for Go.

There have been significant improvements in Go libraries over the years, with GoLearn offering scikit-learn-style fit and predict capabilities, as well as test-splitting and cross-validation utility functions; GoMind offering neural network capabilities; and Gorgonia, a graph computation ML library similar to TensorFlow with scaling capabilities.

Overall, even these libraries do not have the depth of Python-centric libraries with decades of development, and there are capabilities where significant gaps can still be felt in Go, such as in natural language processing (NLP) tasks, especially when compared to spaCy and NLTK.

No native bindings for CUDA

Compute Unified Device Architecture (CUDA) is a parallel computing platform and programming model developed by NVIDIA for programming graphics processing units (GPUs). It allows developers to use the high-performance computing capabilities of NVIDIA GPUs to accelerate a wide range of applications, such as ML, scientific computing, image and video processing, and more.

The main advantage of CUDA is that it allows you to use the massive parallelism of GPUs to speed up computations that can be parallelized. Unfortunately, Go doesn’t have native bindings to CUDA as Python does.

To utilize CUDA in your Go code, you need to import and utilize C functions that create CUDA bindings. To be fair, C code is embedded in Go with the cgo command, which enables the creation of Go packages that call C code. However, reliance on C code and cgo requires a strong proficiency in C for efficient coding and debugging. It can create significant overhead while also opening you up to C-specific issues, such as memory safety cases and security vulnerabilities.

There are also third-party code packages with utility functions for Go bindings for CUDA, such as cuda.

Experimentation constraints

Inherently, Go is not the best for experimentation. As a compiled language, Go code is transformed into machine code that can be executed directly by the CPU rather than being interpreted by a runtime environment at runtime. This feature is helpful as it contributes to Go’s speed and efficiency; yet, you cannot write and execute Go code without compiling it first. This makes it relatively harder to try out different ideas and test different approaches to a problem in Go than in interpreted languages like Python, and R (note, however, that Go can be used as a scripting language).

Go is not as abstracted from the underlying hardware as some other languages. This can be a plus for tasks that require low-level optimization or close control over hardware resources, but it can also make Go code more verbose and require more up-front setup and configuration compared to languages like Python, which can be more flexible and easier to work with in some cases. Considering the depth of libraries and frameworks like scikit-learn and TensorFlow, and the availability of easier solutions for tasks like feature extraction, clustering, and dimensionality reduction in one package, Go is not the most ideal choice for experimentation in ML.

Compiled languages like Go are usually a better choice for high-performance tasks, such as server-side programming and optimizing real-time applications.

Insufficient math and statistics capabilities

As mentioned, Go lacks the depth of industry heavyweights such as Python. It has fewer specialized libraries and packages focused on statistics, calculus, and matrix manipulation, which is a large part of ML and artificial intelligence development.

This might not be a major disadvantage for everyone; some developers welcome the chance to actively write out the code for their ML algorithm or math logic. However, it still means that Go’s ease of use for the same data manipulations, analysis, and predictive algorithms is relatively poorer compared to Python.

Overall, Python, R, and Julia have a strong foothold in the ML community because they were here first. So, positioning itself as a viable alternative is an uphill battle for Go, particularly because these established programming languages continually develop to be better, easier, and more efficient for ML and AI capabilities.

High-level libraries in Go

Go has some high-level libraries like Gonum, Gorgonia, and GoLearn, which provide tools for building and training neural networks, performing numerical computations, and other ML tasks.

However, they are not as feature-rich as their Python library counterparts. These Python libraries and frameworks, such as TensorFlow, scikit-learn, and spaCy, are popular in the ML industry and have been created and iteratively developed to fit the needs of ML developers. They offer features for natural language processing, image embedding, neural networks, and other ML interests.

You could achieve these same capabilities in Go, and they would be potentially more powerful and efficient due to Go’s advantages over other programming languages such as networking, concurrency, and data processing.

However, at the moment, creating these capabilities is only useful for the Go community, as Go isn’t as popular or widely used in the ML community as some other languages like Python or R. It also suffers from a large gap in contributors compared to its older counterparts, leading to a comparatively smaller ecosystem of libraries and tools, which can make it harder to find preexisting solutions to certain problems.

Is the situation changing?

Go is becoming more popular, and the community as a whole is growing. Currently, StackShare puts the number of companies using Go at 2,751, including Uber, Twitch, Shopify, and Slack. Additionally, according to the 2021 Stack Overflow Developer Survey, Go is used by around 9.55 percent of developers, ranking it as the 14th most popular programming language.

While this is good news for the Go community, it doesn’t really translate to the ML space. Go is most notably known for its capabilities in creating scalable servers and large software systems, writing concurrent programs, and launching speedy and lightweight microservices. However, even the official Go website does not list ML as one of its notable use cases.

This is indicative of Go’s current leaning away from the ML space and its lack of a foothold in the ML community.

Future use cases

There haven’t been notable showcases of Go in ML. However, looking at the strengths of the language, it’s possible to pivot from thinking of Go as a language for developing ML models and instead look at it as a language for serving ML models.

Go can be used to build ML model servers, which allow models to be accessed and used by other applications or systems. This can be useful for deploying ML models in production or for building ML APIs that can be accessed by other developers or users. It can be used to build ML applications such as ML-powered recommendation engines or natural language processing tools. It can also be used to build the backend infrastructure or user-facing interfaces for these types of applications.

Creating more toolkits and frameworks that enable faster and more efficient server-side ML could be a useful way for the Go community to open up the codebase for more experimentation in the ML space.

Conclusion

In this article, you explored Go’s potential advantages for machine learning, the challenges currently preventing the widespread use of Go in this field, how improvements can be made, and what future use cases for Go might look like.

image description