Portfolio Optimization in Python With Datalore and AI Assistant

The financial world is a gold mine when it comes to data-driven insights, and fintech companies are increasingly turning to Python for its capacity to handle complex portfolio optimization tasks. In this blog post, we will explore the essential Python tools and libraries for portfolio optimization, walk through the process of calculating fundamental portfolio metrics such as logarithmic returns and Sharpe ratios, and outline how an established optimization strategy – mean-variance optimization – is applied in practice. We’ll also show how Python newcomers can use AI Assistant to tackle these tasks and how results can be effectively shared with stakeholders through Datalore’s reporting features.

If you’ve had Python experience before, you can take a look at an example notebook of portfolio optimization here. If you’re relatively new to Python, read on.

💡 To see the full code, click Sign in and create a personal Datalore account. Then click on three dots near the Recalculate all button in the top right-hand corner and select Edit copy.  To try Datalore free for your team, visit

🚨 Disclaimer: This article and notebook are for informational and educational purposes only and are not intended to serve as personal financial advice.

Why use Python for portfolio optimization

Python is a great choice for portfolio optimization thanks to its ability to:

  • Efficiently process large datasets.
  • Effortlessly query financial market data through yFinance, Alpha Vantage, and other APIs.
  • Provide specialized packages such as PyPortfolioOpt, scipy, cvxpy, and arch, which facilitate intricate optimization tasks and enable the development of advanced investment strategies.

For individuals with a background in finance looking to apply Python for portfolio optimization, the good news is that Python proficiency is not a prerequisite, as Datalore’s AI Assistant can help bridge the gap. Moreover, thanks to Datalore’s reporting capabilities, sharing detailed and reproducible results with stakeholders is a straightforward and swift process.

Calculating fundamentals: logarithmic returns, Sharpe ratios, covariance matrix, and portfolio weights

In almost any portfolio optimization task, our aim is to find the optimal weights for each asset that will maximize the portfolio’s risk-adjusted returns, considering the covariance among assets. Now, let’s dive into each concept separately.

Logarithmic returns

Why use logarithmic returns? Logarithmic returns, often called log returns, have several advantages over simple returns:

  1. Time additivity: Log returns are time additive. For instance, if you want to calculate the total return of an investment over multiple periods, you can simply add the log returns of each period. This isn’t the case with simple returns, which have to be compounded.
  2. Symmetry: Log returns are symmetric in gains and losses. A log return of +10% followed by a log return of -10% brings you back to your original starting value, whereas a 10% simple gain followed by a 10% simple loss leaves you 1% below it. This symmetry simplifies many statistical analyses.
  3. Normality: In finance, it is often assumed that returns are normally distributed. While this assumption is never strictly true, it is often closer to being true for log returns than for simple returns, especially over short time horizons. This assumption leads to many useful properties and makes modeling easier.
  4. Practicality: Log returns are convenient when working with large datasets, since multi-period returns reduce to sums rather than products, which vectorized operations such as cumulative sums handle easily.
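The time additivity and symmetry properties listed above can be checked with a minimal sketch. The price path below is hypothetical and chosen so that the two log returns are exactly +10% and -10%; it is not taken from the notebook.

```python
import numpy as np

# Hypothetical price path: up 10% in log terms, then back down 10% in log terms.
prices = np.array([100.0, 100.0 * np.exp(0.10), 100.0])

log_returns = np.log(prices[1:] / prices[:-1])
simple_returns = prices[1:] / prices[:-1] - 1

# Time additivity: summing log returns gives the total log return.
total_log_return = log_returns.sum()

# Simple returns have to be compounded instead of summed.
total_simple_return = np.prod(1 + simple_returns) - 1

print(log_returns)          # approximately [ 0.1 -0.1]
print(simple_returns)       # approximately [ 0.1052 -0.0952]
print(total_log_return)     # approximately 0
print(total_simple_return)  # approximately 0
```

Note that the simple returns for the same round trip are not mirror images of each other, while the log returns are.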

To calculate logarithmic returns in Datalore, you can use the following code:

import numpy as np
for df in dataframes:
   df['Daily Return'] = np.log(df['Adj Close']/df['Adj Close'].shift(1))

Sharpe ratios

The Sharpe ratio is a metric used to assess the risk-adjusted performance of an investment. Calculated as the difference between the returns of the investment and the risk-free rate, divided by the investment’s standard deviation, the Sharpe ratio effectively communicates how much additional return an investor is receiving for the extra volatility endured for holding a riskier asset. Higher Sharpe ratios indicate more desirable investments.

To stay consistent with log returns, we convert the annual risk-free rate to its continuously compounded equivalent, log(1 + annual rate), and divide it by 252 trading days to get the daily_risk_free_rate. See the code below for clarification:

import math
annual_risk_free_rate = 0.02
daily_risk_free_rate = math.log(1 + annual_risk_free_rate) / 252
sharpe_ratios = {}
for i, df in enumerate(dataframes):
   excess_return = df['Daily Return']-daily_risk_free_rate
   sharpe_ratio = excess_return.mean() / df['Daily Return'].std()
   sharpe_ratios[tickers_array[i]] = sharpe_ratio
for ticker, ratio in sharpe_ratios.items():
   print(f'Sharpe Ratio for {ticker}: {ratio}')
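The loop above relies on the dataframes and tickers_array objects created earlier in the notebook. As a self-contained sketch, the same calculation can be run on synthetic returns; the random data and the sqrt(252) annualization step are assumptions added for illustration (the scaling is a common convention that presumes independent daily returns) and are not part of the original notebook.

```python
import math

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
# Synthetic daily log returns for one hypothetical asset (not real market data).
daily_returns = pd.Series(rng.normal(loc=0.0005, scale=0.01, size=252))

annual_risk_free_rate = 0.02
daily_risk_free_rate = math.log(1 + annual_risk_free_rate) / 252

excess_return = daily_returns - daily_risk_free_rate
daily_sharpe = excess_return.mean() / daily_returns.std()

# Common convention, assuming independent daily returns: scale by sqrt(252).
annualized_sharpe = daily_sharpe * math.sqrt(252)

print(f"Daily Sharpe: {daily_sharpe:.4f}, annualized: {annualized_sharpe:.4f}")
```

Because the risk-free rate is a constant, subtracting it does not change the standard deviation, so dividing by the standard deviation of the raw returns or of the excess returns gives the same result.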

Covariance matrix

A covariance matrix expresses how different assets in a portfolio move in relation to each other. Understanding these relationships is key to diversification since it allows investors to offset potential risks by combining assets that do not move in tandem. Portfolio optimization uses this matrix to minimize overall risk for a given level of expected return.

To calculate a covariance matrix, you can call the cov() method on a pandas DataFrame of returns, for example daily_returns.cov().
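Here is a minimal sketch of that call, assuming daily_returns is a DataFrame with one column of daily log returns per asset. The tickers and the random data are hypothetical.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic daily log returns; "AAA", "BBB", and "CCC" are hypothetical tickers.
daily_returns = pd.DataFrame(
    rng.normal(loc=0.0005, scale=0.01, size=(250, 3)),
    columns=["AAA", "BBB", "CCC"],
)

cov_matrix = daily_returns.cov()    # pairwise covariances, variances on the diagonal
corr_matrix = daily_returns.corr()  # the same relationships rescaled to [-1, 1]

print(cov_matrix.shape)  # (3, 3)
print(cov_matrix)
```

The diagonal of the covariance matrix holds each asset's variance, and the matrix is symmetric, so the entry for (AAA, BBB) equals the entry for (BBB, AAA).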

Portfolio weights

Weights in portfolio optimization signify the proportion of the total portfolio value allocated to each asset or investment. Adjusting these weights is how we manage the risk-return profile of the portfolio. The optimization process helps us find the set of weights that optimizes the portfolio’s performance based on our investment criteria. In Python, we can store the weights in a list or array, for example: weights = [0.2, 0.2, 0.2, 0.2, 0.2], meaning each asset in the portfolio is equally weighted. The sum of portfolio weights should be 1 if we want full portfolio allocation.
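As a short illustration, weights can be derived from amounts invested and then used to compute a weighted portfolio return. The amounts and mean returns below are made-up numbers for the sketch.

```python
import numpy as np

# Hypothetical amounts invested in three assets.
allocations = np.array([5000.0, 3000.0, 2000.0])
weights = allocations / allocations.sum()  # normalize so the weights sum to 1

# Equal weighting across five assets, as in the example above.
equal_weights = np.array([0.2, 0.2, 0.2, 0.2, 0.2])

# Hypothetical mean daily returns for the five assets.
mean_returns = np.array([0.0004, 0.0006, 0.0002, 0.0005, 0.0003])
portfolio_return = equal_weights @ mean_returns  # weighted average of asset returns

print(weights)           # approximately [0.5 0.3 0.2]
print(portfolio_return)  # approximately 0.0004
```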

Overview of most popular portfolio optimization strategies

Below you can find an overview of several popular portfolio optimization strategies alongside their pros and cons.

| Strategy | Pros | Cons |
| --- | --- | --- |
| Classic mean-variance optimization (MVO) | Seeks to maximize return for a given risk level and promotes diversification. | High sensitivity to errors in estimation of expected returns; unstable portfolio weights. |
| Individual asset Sharpe ratio optimization | Promotes focus on risk-adjusted returns; simple and quantifiable. | Relies on historical data; ignores asset correlations; may exclude potentially beneficial assets. |
| Black-Litterman model | Combines investors’ views with market equilibrium; robust to input estimation errors. | Complexity in deriving investor’s view of returns; relies on the assumption of normal distributions. |
| Risk parity | Better risk distribution; less sensitive to return predictions. | Assumes assets’ risks (volatility) are a key factor in returns; inappropriate during non-trending, volatile markets. |
| Genetic algorithms/Monte Carlo simulations | Can optimize complex objective functions; not restricted by assumptions of returns distribution. | Computationally intensive; requires careful calibration and good programming knowledge. |
| LS optimizer (least squares optimization) | Minimizes tracking error; considers correlations of assets. | Limited application outside tracking; requires consistent correlation. |

Mean-variance optimization with Python

In this article, we will take a detailed look at the mean-variance optimization method.

The main goal of the mean-variance optimization (MVO) method is to find the most efficient portfolio. An “efficient” portfolio is defined as a portfolio with the maximum return for a given level of risk.

The MVO method considers the return and variance (or standard deviation, a measure for risk) of each potential investment in a portfolio. It also accounts for the covariance between different investments, which is a measure of how their returns move together.

Step 1: Get portfolio metrics 

We need to create a get_portfolio_metrics Python function to get the portfolio_return, portfolio_volatility, and sharpe_ratio for a given portfolio. Here’s what this function does:

  • It takes in the asset weights, returns, and covariance as inputs.
  • First, it finds the portfolio return by multiplying the return of each asset by its weight, and adding these together.
  • It then calculates the portfolio volatility, which is a measure of risk based on how different assets in the portfolio move together.
  • Finally, it works out the Sharpe ratio. This is done by subtracting the daily risk-free rate from the portfolio return and dividing the result by the portfolio volatility.
  • The function returns these three metrics, which can be used to understand and optimize the portfolio’s performance.

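import numpy as np
def get_portfolio_metrics(weights, returns, covariance):
   weights = np.array(weights)
   # Weighted sum of the assets' mean returns
   portfolio_return = np.sum(returns * weights)
   # Portfolio standard deviation: sqrt(w^T * Cov * w)
   portfolio_volatility = np.sqrt(np.dot(weights.T, np.dot(covariance, weights)))
   # Uses daily_risk_free_rate defined in the Sharpe ratios section
   sharpe_ratio = (portfolio_return - daily_risk_free_rate) / portfolio_volatility
   return portfolio_return, portfolio_volatility, sharpe_ratio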
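To see what the function returns, here is a self-contained sketch on a two-asset toy portfolio. The volatility line uses the standard sqrt(wᵀ Σ w) formula, and the returns and covariance values are hypothetical numbers chosen so the result is easy to verify by hand.

```python
import math

import numpy as np

daily_risk_free_rate = math.log(1 + 0.02) / 252

def get_portfolio_metrics(weights, returns, covariance):
    weights = np.array(weights)
    portfolio_return = np.sum(returns * weights)
    # Standard portfolio volatility formula: sqrt(w^T * Cov * w)
    portfolio_volatility = np.sqrt(np.dot(weights.T, np.dot(covariance, weights)))
    sharpe_ratio = (portfolio_return - daily_risk_free_rate) / portfolio_volatility
    return portfolio_return, portfolio_volatility, sharpe_ratio

# Hypothetical inputs: two uncorrelated assets with daily volatilities of 1% and 2%.
mean_returns = np.array([0.001, 0.002])
covariance = np.array([[0.0001, 0.0], [0.0, 0.0004]])

portfolio_return, portfolio_volatility, sharpe_ratio = get_portfolio_metrics(
    [0.5, 0.5], mean_returns, covariance
)
print(portfolio_return)      # approximately 0.0015
print(portfolio_volatility)  # approximately 0.0112
print(sharpe_ratio)
```

The portfolio volatility of roughly 1.12% is lower than the 1.5% weighted average of the two individual volatilities, which is the diversification effect that the covariance term captures.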

Step 2: Create an optimization function

In Python, libraries like SciPy have optimization functions that aim to find the minimum value of a given function. To use these for maximizing the Sharpe ratio (since higher values are better), we take the negative of the Sharpe ratio.

def objective(weights, returns, covariance):
   return -get_portfolio_metrics(weights, returns, covariance)[2]

We then need to set constraints for optimization. To find out what each constraint means, you can check this notebook.

# Number of assets in portfolio
num_assets = len(daily_returns.columns)
# Initial guess for weights (equal distribution)
guess = num_assets*[1./num_assets]
# Equality constraint: the weights must sum to 1 (full allocation)
constraints = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1})
# Bounds for weights. Each weight can be a value between 0 and 1 (inclusive)
bounds = tuple((0, 1) for asset in range(num_assets))
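As a rough sketch of how these settings behave: SciPy treats an 'eq' constraint as satisfied when its function returns zero, so the lambda returns 0 only when the weights sum to 1. The (0, 1) bounds rule out short positions and leverage in a single asset. The four-asset portfolio size below is a hypothetical value for illustration.

```python
import numpy as np

num_assets = 4  # hypothetical portfolio size
guess = num_assets * [1.0 / num_assets]
constraints = {"type": "eq", "fun": lambda x: np.sum(x) - 1}
bounds = tuple((0, 1) for _ in range(num_assets))

# Fully allocated weights: the constraint function returns zero.
feasible = constraints["fun"](np.array(guess))
# Weights summing to 2: the constraint is violated by 1.
infeasible = constraints["fun"](np.array([0.5, 0.5, 0.5, 0.5]))

print(feasible)    # 0.0
print(infeasible)  # 1.0
print(bounds)      # ((0, 1), (0, 1), (0, 1), (0, 1))
```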

Then, we use the minimize function of the scipy.optimize module to find the portfolio with the highest Sharpe ratio (that is, the lowest negative Sharpe ratio).

from scipy.optimize import minimize
optimal_results = minimize(objective, guess, args=(daily_returns.mean(), daily_returns.cov()), method='SLSQP', bounds=bounds, constraints=constraints)
# Get optimal weights
optimal_weights = optimal_results.x
# Print optimal weights
print("Optimal weights: ", optimal_weights)
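For readers who want to run the whole optimization without the notebook's data, the steps can be sketched end to end on synthetic returns. The tickers, the random data, and the negative_sharpe helper name are assumptions made for this sketch; checking the success flag before trusting the weights is an added precaution, not something the original code does.

```python
import math

import numpy as np
import pandas as pd
from scipy.optimize import minimize

rng = np.random.default_rng(7)
# Synthetic daily log returns for three hypothetical tickers (not real market data).
daily_returns = pd.DataFrame(
    rng.normal(loc=[0.0008, 0.0005, 0.0003], scale=[0.015, 0.010, 0.005], size=(500, 3)),
    columns=["AAA", "BBB", "CCC"],
)
daily_risk_free_rate = math.log(1 + 0.02) / 252

def negative_sharpe(weights, mean_returns, covariance):
    """Negative daily Sharpe ratio, so that minimizing it maximizes the Sharpe ratio."""
    portfolio_return = np.dot(weights, mean_returns)
    portfolio_volatility = np.sqrt(weights @ covariance @ weights)
    return -(portfolio_return - daily_risk_free_rate) / portfolio_volatility

num_assets = daily_returns.shape[1]
guess = np.full(num_assets, 1.0 / num_assets)
mean_returns = daily_returns.mean().to_numpy()
covariance = daily_returns.cov().to_numpy()

result = minimize(
    negative_sharpe,
    guess,
    args=(mean_returns, covariance),
    method="SLSQP",
    bounds=[(0, 1)] * num_assets,
    constraints={"type": "eq", "fun": lambda w: np.sum(w) - 1},
)
if not result.success:
    print("Optimizer did not converge:", result.message)

optimal_weights = pd.Series(result.x, index=daily_returns.columns)
print(optimal_weights.round(4))
print("Daily Sharpe ratio:", -result.fun)
```

The resulting weights depend entirely on the synthetic sample, so they illustrate the mechanics rather than any real allocation.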

Step 3: Calculate portfolio metrics and plot the results

To return key portfolio metrics, we will once again use the get_portfolio_metrics function.

port_return, port_volatility, sharpe_ratio = get_portfolio_metrics(optimal_weights, daily_returns.mean(), daily_returns.cov())

To find the exact code required for plotting, we recommend opening the notebook. If you’re new to Python plotting libraries, we recommend using AI Assistant to help you with this part. Check out the next section for suggested prompts.

AI prompt cheat sheet to implement optimization strategies with Python

| Task | Prompt |
| --- | --- |
| Fetching stock price data from the Yahoo Finance API | Write a Python script to fetch historical stock prices from Yahoo Finance using the “yfinance” package for this {ticker list} between {start date} and {end date}. Save the result to the {dataframes} DataFrame. |
| Calculating logarithmic returns | Create a Python script to calculate logarithmic returns for the “Adj Close” prices for all tickers present in the {dataframes} DataFrame, and add the log returns to the existing DataFrame. |
| Plotting correlation matrix of log returns | Plot a correlation matrix of daily/log returns using plotly. |
| Implementing {X strategy} | Implement the {X strategy} in Python for the provided {dataframes}. Take into account the annual risk-free rate of 0.02, convert it to a daily rate, and use the logarithm of the annual risk-free rate for the calculations. |
| Visualizing asset allocation | Visualize new asset allocation using plotly. |
| Visualizing cumulative returns | Visualize portfolio’s cumulative returns on a line plot using plotly. |
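When reviewing code generated for the cumulative returns prompt, it helps to know what the underlying calculation should look like. Because log returns are additive, cumulative returns can be obtained from a running sum. The sketch below uses a hypothetical price series and leaves out the plotting step.

```python
import numpy as np
import pandas as pd

# Hypothetical adjusted close prices for one asset.
prices = pd.Series([100.0, 102.0, 101.0, 105.0])

# Daily log returns; the first value is NaN because there is no prior price.
log_returns = np.log(prices / prices.shift(1)).dropna()

# Running sum of log returns, converted back to a simple cumulative return.
cumulative_returns = np.exp(log_returns.cumsum()) - 1

print(cumulative_returns.iloc[-1])  # approximately 0.05, i.e. 105 / 100 - 1
```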

Note: The prompts were tested using Datalore’s Ask AI feature. The benefit of using Datalore for Python code generation is that Datalore’s AI Assistant takes into account the context of the notebook and reduces the amount of prompt fine-tuning and nuancing you’d usually need to do inside other AI chats. What’s more, you also get instant results and visualizations right after running the generated cell.

Presenting the results of your work

To help you share the results of your work with stakeholders, you can leverage Datalore’s Report builder. Select certain visuals and notebook cells, arrange them on the canvas, and publish and share your work as a static or interactive report. Your stakeholders will be able to view and interact with your work and get the results recomputed on the fly. Take a look at an example report here.

You can try Datalore online for your team on the Team plan, or explore our Enterprise version for working with sensitive data. However, please note that the Ask AI feature is currently only supported as part of the Team plan.

Did you find this article helpful? If so, please share it with your colleagues. Let us know in the comments below if you have any suggestions as to how we can improve it.
