Datalore
Collaborative data science platform for teams
Datalore On-Premises or Cloud: Which Suits You Best?
In an era where data is the new currency, the ability to quickly gain actionable insights can be a game-changer for businesses and research institutions alike. Shortening the feedback loop between data scientists, analysts, and business intelligence teams can lead to more agile and responsive strategies, ultimately speeding up innovation and optimizing operations.
JetBrains has always been at the forefront of this challenge, delivering best-in-class tools, including Datalore – the collaborative data science platform for analysts, business teams, and anyone else who needs quicker insights from their data.
One of the advantages of Datalore is that it offers two different operational models: On-Premises and Cloud. In this post, we will consider a few advantages of each model and cover the most typical challenges organizations might face when adding a new tool to their daily portfolio.
When is Datalore On-Premises preferable?
Datalore On-Premises is a self-managed installation in the environment of your choice – a private cloud, a public cloud, or even your own bare-metal server.
Working with internally hosted data
Many companies host their databases fully on-premises instead of migrating them offsite. The reasons vary, from compliance factors to cost savings. However, this can lead to a problem. If the data is hosted locally, but the service that needs this data is located somewhere outside of the corporate perimeter, then the data becomes inaccessible to the service.
For this reason, an on-premises deployment of data-consuming or data-processing services is the best solution when you’re working with internally hosted data, because you keep full control over networking and security. This allows you to customize your configuration without jeopardizing any security measures your organization has in place.
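To make the reachability problem concrete, here is a minimal sketch of a check you could run from wherever your notebooks execute to see whether an internal database port is reachable at all. It uses only the Python standard library and is not part of Datalore. The host name and port in the usage comment are hypothetical placeholders.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Hypothetical usage: "db.internal.example" and 5432 are placeholders.
# can_reach("db.internal.example", 5432)
```

A successful check only shows that the network path is open. Authentication, TLS, and firewall rules for specific users still have to be verified separately.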
Extended compliance requirements
Considering the nature of your data is important when choosing the right tool for processing it, as specific industries may impose additional requirements for data handling systems.
For example, a US-based organization that processes health-related data must comply with HIPAA (the Health Insurance Portability and Accountability Act), while organizations worldwide that handle payment card data must comply with PCI DSS (the Payment Card Industry Data Security Standard). These requirements are typically mandated by law or industry standards and apply to both the product and the organization as a whole.
In certain cases, as long as the data doesn’t leave the organizational perimeter, the product itself doesn’t have to undergo the whole process of vetting, testing, and certification by an independent third-party authority, like the Office for Civil Rights or the National Institute of Standards and Technology.
If you work in a context with extensive compliance standards, on-premises deployment is preferable. Otherwise, your choice of tool vendor becomes significantly limited, as both the tool and the vendor need to be in compliance and hold the necessary certifications, which are expensive and difficult to obtain.
JetBrains is committed to maintaining the highest level of security when it comes to our data. An annual review by our external auditors recently confirmed our SOC 2 Type II compliance status.
Specific environment requirements
Another case where on-premises installations are particularly suitable is when there’s a high demand for customization, which is often something that SaaS platforms either can’t provide or can only provide in a limited capacity.
Here’s a story from one Datalore customer who decided to go with an on-premises deployment:
“By using Datalore On-Premises, we can customize the environment using Linux shell scripts built into the agent image used by Datalore. We can also install our own packages using pip, Poetry, dependency files, and more without any restrictions. This reduces the environment bootstrapping time, which is essential for us as a fast-paced team.”
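As an illustration of the kind of environment customization described in this story, a team that pre-installs packages into its image might want a quick way to confirm they are actually present before a notebook starts. The sketch below is one possible check using only the Python standard library. It is not a Datalore API, and the package names in the usage comment are hypothetical examples.

```python
from importlib import metadata


def missing_packages(required: list[str]) -> list[str]:
    """Return the distribution names from `required` that are not installed."""
    missing = []
    for name in required:
        try:
            # Looks up the installed distribution's metadata by name.
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Hypothetical usage: the package names are placeholders for your own list.
# missing_packages(["pandas", "scikit-learn"])
```

An empty result means every listed distribution is installed in the current environment. It does not check version constraints.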
When is Datalore Cloud preferable?
Datalore Cloud is our software-as-a-service offering, managed and operated by JetBrains.
No-ops strategy
Depending on your organizational goals and priorities, it may make more sense to completely avoid having anything on-premises, including servers and data storage. Instead, you can use managed services by various cloud providers, allowing you to focus on your daily tasks rather than worrying about infrastructure management.
Datalore Cloud is particularly advantageous for organizations following a no-ops strategy because it eliminates the need for dedicated IT staff to manage hardware or software updates. Additionally, its extensive list of machines provides workload scaling capabilities, ensuring optimal performance as data workloads grow and reducing your organization’s operational burden.
Starting your data journey
When a team begins a project, they usually need to choose their infrastructure and tooling, a process that can be lengthy enough to have a visible impact on their timeline.
Datalore Cloud speeds this process along because the only thing you need to start using it for data exploration is your browser. It also comes with a no-commitment 14-day free trial, allowing you to easily determine whether it meets your needs.
Once you’ve signed up for Datalore Cloud, you’re ready to explore your data immediately. With any of the paid Datalore Cloud tiers, you get 750 hours of computation time using 4 vCPUs and 16 GB of RAM (2 vCPUs and 4 GB of RAM for free tier users). We’ve found that these resources are sufficient in about 90% of cases, but if you need more, you can scale up with just a single click. Datalore Cloud has an extensive list of machine options that will suit even the most demanding users.
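To get a feel for what 750 hours means in practice, the rough arithmetic below compares the quota with a usage pattern. The quota figure comes from this article. The usage pattern (6 hours a day over 22 working days) is a hypothetical assumption, so substitute your own numbers.

```python
# Figure taken from the article: paid tiers include 750 hours of computation.
QUOTA_HOURS = 750


def hours_used(hours_per_day: float, working_days: int) -> float:
    """Total computation hours for a given usage pattern."""
    return hours_per_day * working_days


def hours_remaining(hours_per_day: float, working_days: int,
                    quota: float = QUOTA_HOURS) -> float:
    """Hours left in the quota; a negative value means the quota is exceeded."""
    return quota - hours_used(hours_per_day, working_days)


# Hypothetical pattern: 6 hours of notebook runtime on each of 22 working days.
print(hours_used(6, 22))       # 132
print(hours_remaining(6, 22))  # 618
print(QUOTA_HOURS / 24)        # 31.25 days of round-the-clock runtime
```

Under that assumed pattern, the quota leaves substantial headroom, which is consistent with the claim that the included resources cover most cases.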
Flexibility
Your company’s tooling landscape can change rapidly, as your business requirements evolve together with your team. Because of this, it may not be wise to commit to the fixed, long-term seat capacity offered by Datalore On-Premises.
With Datalore Cloud, you have more flexibility to adjust seat capacity, scaling your team’s seats up or down based on demand and your current requirements. You can also choose between monthly and discounted annual commitments.
Another important aspect of choosing between the deployment models is the pricing structure. On-premises solutions typically carry an infrastructure setup burden, in both hardware and people, that increases their total cost of ownership.
Given the above, Datalore Cloud might be more beneficial if you have a demand for computation-intensive tasks but you either don’t have the expensive hardware required or don’t want to invest heavily in it. In that case, Datalore Cloud offers state-of-the-art environments prepared with all of the necessary resources at a fraction of what the hardware would cost.
I hope this article helped you get a better idea of which key factors to consider when deciding between on-premises and cloud computing for your business needs.
If you have any remaining questions, do not hesitate to schedule a call. We’d be happy to discuss your specific requirements in more detail.