Python Basics: Using sets to compare keymaps

If you’ve ever visited JetBrains at a conference, you know that we always have handouts with keymaps for our products. You might also know that we have different keymaps for Windows/Linux and for Mac due to the different keyboard layouts. People sometimes grab the wrong one and find out too late that they can’t use it, so we’ve decided to unify the keymaps into a single keymap.

Unfortunately, a single keymap has less space than two separate ones, so we need to select a subset of hotkeys to keep. WebStorm did this recently, so let’s compare against their keymap. It’d be helpful to find out which keys they selected and, more importantly, which keys they left out. To do this, let’s write a quick script.

If you’d like to follow along with the script, you can always have a look at the code on GitHub.

Python has built-in set operations so we can use these to do the hard work later. However, in order to use these, we have to parse our data. Our source materials are two CSV files that look like this:

So let’s define an object that can represent a hotkey:
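A minimal sketch of such an object, assuming it only needs to hold a key combination and its description. The `action` attribute name appears later in this post; the `keystroke` attribute name and the empty-string defaults are assumptions made for illustration.

```python
class Hotkey:
    """One keymap entry: a key combination and what it does."""

    def __init__(self, keystroke="", action=""):
        # Both attribute defaults are assumptions; the post only shows
        # that attributes are assigned after the object is created.
        self.keystroke = keystroke
        self.action = action


completion = Hotkey()
completion.keystroke = "Ctrl + Space"
completion.action = "Basic code completion"
```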

Using sets

In this script, we’re going to use sets to compare the keymaps. Sets are meant for collections of unique objects where the order of the objects doesn’t matter. Hotkeys are a great example of these: they need to be unique (can’t bind two actions to a single key combination) and the order doesn’t matter (it doesn’t matter if we define Ctrl+Shift+A first, or Ctrl+Alt+O).
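To make the two properties concrete, here is a small example with plain strings standing in for hotkeys: duplicates collapse into one element, and two sets with the same elements are equal regardless of the order in which they were written.

```python
first = {"Ctrl+Shift+A", "Ctrl+Alt+O"}
# Same elements in a different order, with one duplicate.
second = {"Ctrl+Alt+O", "Ctrl+Shift+A", "Ctrl+Alt+O"}

print(len(second))      # 2
print(first == second)  # True
```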

Because a set only contains unique objects, it checks on insertion whether it already holds an object that is equal to the new one. We’d like to have hotkeys matched on the keystroke rather than the description (which may differ slightly), so we need to define the appropriate overrides (use Ctrl+O to see them all) on our Hotkey class:

  • __eq__ which returns whether or not objects are equal
  • __ne__ which returns whether or not objects are not equal
  • __hash__ which should return a number that represents the object. This number must be the same for objects that are equal to each other, and should ideally be different for objects that are not equal to each other

For our class this will look like:

Our hash function here uses the recommended approach from the Python documentation: return a hash of a tuple of the objects that contribute to the equality check.
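A sketch of what these overrides could look like, assuming equality is decided by a `keystroke` attribute alone (the attribute name and constructor are assumptions). The hash is taken over a tuple of the fields used in `__eq__`, which here is a one-element tuple.

```python
class Hotkey:
    def __init__(self, keystroke="", action=""):
        self.keystroke = keystroke
        self.action = action

    def __eq__(self, other):
        # Only the keystroke takes part in the comparison.
        if not isinstance(other, Hotkey):
            return NotImplemented
        return self.keystroke == other.keystroke

    def __ne__(self, other):
        return not self == other

    def __hash__(self):
        # Hash a tuple of exactly the fields used in __eq__.
        return hash((self.keystroke,))


a = Hotkey("Ctrl + Space", "Basic code completion")
b = Hotkey("Ctrl + Space", "Code completion")
hotkeys = {a, b}
print(len(hotkeys))  # 1
```

In Python 3 the default `__ne__` already inverts `__eq__`, so defining it explicitly is optional there.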

Parsing, first pass

Let’s start with a first, naive approach to parsing the files. We will expand this code later to deal with edge cases as they arise. To begin, let’s only parse the PyCharm file:
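A rough sketch of such a first pass. The names `csvfile`, `item`, `to_add` and `pycharm_hotkeys` come from this post; the two-column layout (keystroke first, action second), the sample rows, and the use of an in-memory string instead of a real file are assumptions so the example runs on its own.

```python
import csv
import io


class Hotkey:
    def __init__(self, keystroke="", action=""):
        self.keystroke = keystroke
        self.action = action

    def __eq__(self, other):
        return isinstance(other, Hotkey) and self.keystroke == other.keystroke

    def __hash__(self):
        return hash((self.keystroke,))


# Stand-in for the PyCharm CSV file (assumed layout: keystroke,action).
SAMPLE_CSV = "Ctrl + Space,Basic code completion\nAlt + F7,Find usages\n"

pycharm_hotkeys = set()
# With a real file this would be something like: with open(path) as csvfile:
csvfile = io.StringIO(SAMPLE_CSV)
for item in csv.reader(csvfile):
    # Define the key to add
    to_add = Hotkey()
    to_add.keystroke = item[0]
    to_add.action = item[1]
    pycharm_hotkeys.add(to_add)

print(len(pycharm_hotkeys))  # 2
```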

Exploring Results with the Debugger

Let’s run the code, and then have a look at the objects we created using the PyCharm debugger. First, put a breakpoint at the last line of the main method (pycharm_hotkeys.add(to_add)). Then right-click anywhere in the code and select ‘Debug’:

debug code

And now in the debugger, we see this:

debugger

Wouldn’t it be nice to see something clearer than <__main__.Hotkey object at 0x0000015A02D377B8> here? Well, thankfully that is quite easy: we just have to add an override for the __repr__ method of our object.
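One possible `__repr__`, assuming the `<keystroke> action` format that the expected value quoted later in this post suggests. The attribute names are the same assumptions as before.

```python
class Hotkey:
    def __init__(self, keystroke="", action=""):
        self.keystroke = keystroke
        self.action = action

    def __repr__(self):
        # Shown by the debugger and by print() instead of the default
        # <__main__.Hotkey object at 0x...> text.
        return "<{}> {}".format(self.keystroke, self.action)


key = Hotkey("Ctrl + Space", "Basic code completion")
print(repr(key))  # <Ctrl + Space> Basic code completion
```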

Now when we start to debug our code again, we get a much better result:

debugger 2

And we can immediately spot an issue: the key is incomplete. It is supposed to be “<Ctrl + Space> Basic code completion (the name of any class, method or variable)”. What’s happening?

This CSV file was saved using Excel, and we’re seeing an encoding issue. If you look at the ‘item’ line in the debugger, you can see several strange characters before ‘Ctrl’. And if we look at the csvfile line, we can see that Python is attempting to decode the file as cp1252, even though the file is UTF-8 with a BOM. After a quick look in the Python docs, we find that we need to pass encoding='utf-8-sig' to the open function.
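The effect can be reproduced without a real file. The sketch below wraps a few bytes that start with the UTF-8 byte order mark in `io.TextIOWrapper`, which accepts the same `encoding` argument as `open`; the sample row is made up for illustration.

```python
import csv
import io

# A UTF-8 file with BOM, as Excel writes it (sample content is made up).
raw = b"\xef\xbb\xbfCtrl + Space,Basic code completion\r\n"


def first_keystroke(encoding):
    """Decode the bytes with the given encoding and return the first cell."""
    csvfile = io.TextIOWrapper(io.BytesIO(raw), encoding=encoding, newline="")
    return next(csv.reader(csvfile))[0]


wrong = first_keystroke("cp1252")     # BOM bytes become three stray characters
right = first_keystroke("utf-8-sig")  # BOM is recognised and dropped
print(repr(right))  # 'Ctrl + Space'
```

With a file on disk, the equivalent call would be `open(path, newline="", encoding="utf-8-sig")`, where `path` is whatever the CSV file is called.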

Handling special cases

At this point when we look at the keymap, and the results we’re getting in the debugger, we can see that there are a couple of hotkeys which are going to be hard to parse:

Ctrl +X, Shift + Delete - Cut current line or selected block to clipboard

Alt + F7 / Ctrl + F7 - Find usages / Find usages in file

These are either alternative hotkeys for the same action (indicated with the comma) or two related hotkeys on a single line (with the slash). As there are many of these on the WebStorm keymap, we’ll need to deal with them. Let’s first refactor our Hotkey parsing code into a function. To help PyCharm, let’s start by extracting all of the code in the main clause into a main function: select everything in the if __name__ == '__main__' clause, and use Ctrl + Alt + M to extract the method.

Then, let’s select the lines from # Define the key to add until to_add.action = item[1], and use Ctrl + Alt + M again. Let’s name this function ‘parse_hotkey’. Let’s also manually rework the arguments to be a little more readable: keystroke and action instead of just a list item:

Now we can change our parse_hotkey function to return a list of hotkeys rather than a single hotkey: return [to_add]. Because the result is now a list, we also change pycharm_hotkeys.add(to_add) to pycharm_hotkeys.update(to_add); update adds every element of the list to the set.
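A sketch of the refactored function under the same assumptions as before (a `Hotkey` with `keystroke` and `action` attributes). It also shows why `add` no longer works once the function returns a list: lists are not hashable, so they cannot be set elements.

```python
class Hotkey:
    def __init__(self, keystroke="", action=""):
        self.keystroke = keystroke
        self.action = action

    def __eq__(self, other):
        return isinstance(other, Hotkey) and self.keystroke == other.keystroke

    def __hash__(self):
        return hash((self.keystroke,))


def parse_hotkey(keystroke, action):
    # Define the key to add
    to_add = Hotkey()
    to_add.keystroke = keystroke
    to_add.action = action
    return [to_add]


pycharm_hotkeys = set()
to_add = parse_hotkey("Ctrl + Space", "Basic code completion")
pycharm_hotkeys.update(to_add)  # adds each Hotkey in the list

try:
    pycharm_hotkeys.add(to_add)  # tries to add the list itself
    add_failed = False
except TypeError:
    add_failed = True
```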

At this point, we can add code to handle the special cases in the parse_hotkey function. For the sake of brevity (hah), I’ll omit the details of handling special cases, and of refactoring the CSV reading method to easily read both files. You can see the full code on GitHub if you’d like. A couple of highlights:
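The actual handling lives in the GitHub repository. Purely as a hypothetical illustration (not the code from the repository), one way to split the two kinds of lines shown above might be the following; the function name `split_keystrokes` and its return shape are invented for this sketch.

```python
def split_keystrokes(keystroke, action):
    """Return a list of (keystroke, action) pairs for one keymap line.

    Hypothetical helper: a slash separates related hotkeys that each have
    their own action, a comma separates alternatives for the same action.
    """
    if " / " in keystroke:
        keys = [k.strip() for k in keystroke.split(" / ")]
        actions = [a.strip() for a in action.split(" / ")]
        if len(keys) == len(actions):
            return list(zip(keys, actions))
    keys = [k.strip() for k in keystroke.split(",")]
    return [(k, action) for k in keys if k]


cut = split_keystrokes(
    "Ctrl +X, Shift + Delete",
    "Cut current line or selected block to clipboard",
)
find = split_keystrokes(
    "Alt + F7 / Ctrl + F7",
    "Find usages / Find usages in file",
)
```

This simple split would mishandle a hotkey that uses the comma key itself, so a real implementation would likely need extra rules.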

Set Operations

At this point, we have two sets: pycharm_hotkeys, and webstorm_hotkeys. Now we can use several methods to analyze these sets:

  • intersection: which elements are in both sets? In this case: which hotkeys are both on the PyCharm, and on the WebStorm keymap?
  • difference: which elements are in set A, but not in set B (one way around)
  • symmetric_difference: which elements are in set A, but not in set B; but also which elements are in set B, but not in set A (both ways)
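The three methods can be tried on small sets of strings first. The contents of the two sets below are made up for illustration and do not reflect the real keymaps.

```python
# Made-up sample data, not the real keymaps.
pycharm = {"Ctrl + Space", "Alt + F7", "Ctrl + F7"}
webstorm = {"Ctrl + Space", "Alt + F7", "Ctrl + Shift + A"}

both = pycharm.intersection(webstorm)
only_pycharm = pycharm.difference(webstorm)
only_webstorm = webstorm.difference(pycharm)
either_not_both = pycharm.symmetric_difference(webstorm)

print(sorted(both))             # ['Alt + F7', 'Ctrl + Space']
print(sorted(only_pycharm))     # ['Ctrl + F7']
print(sorted(either_not_both))  # ['Ctrl + F7', 'Ctrl + Shift + A']
```

Note that `symmetric_difference` is the union of the two one-way differences, so it does not tell you which side an element came from.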

In our case, I’m actually interested in the differences both ways, but I care about which way around it is. This leaves my main method looking like:
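A sketch of such a main function, assuming it computes `difference` once in each direction. The `Hotkey` class repeats the earlier assumptions, the `build` helper is hypothetical, and the sample data is made up; note that the two “Ctrl + Space” entries have different descriptions but still count as the same hotkey.

```python
class Hotkey:
    def __init__(self, keystroke="", action=""):
        self.keystroke = keystroke
        self.action = action

    def __eq__(self, other):
        return isinstance(other, Hotkey) and self.keystroke == other.keystroke

    def __hash__(self):
        return hash((self.keystroke,))

    def __repr__(self):
        return "<{}> {}".format(self.keystroke, self.action)


def build(rows):
    """Hypothetical helper: turn (keystroke, action) pairs into a set."""
    return {Hotkey(keystroke, action) for keystroke, action in rows}


# Made-up sample data, not the real keymaps.
pycharm_hotkeys = build([
    ("Ctrl + Space", "Basic code completion"),
    ("Alt + F7", "Find usages"),
    ("Ctrl + F7", "Find usages in file"),
])
webstorm_hotkeys = build([
    ("Ctrl + Space", "Code completion"),
    ("Alt + F7", "Find usages"),
    ("Ctrl + Shift + A", "Find action"),
])


def main():
    only_pycharm = pycharm_hotkeys.difference(webstorm_hotkeys)
    only_webstorm = webstorm_hotkeys.difference(pycharm_hotkeys)
    print("Only on the PyCharm keymap:", only_pycharm)
    print("Only on the WebStorm keymap:", only_webstorm)
    return only_pycharm, only_webstorm


only_pycharm, only_webstorm = main()
```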

When we run this code, we get the results we’re interested in:

If you run the code you may notice that the keys appear in a different order every time, which is a result of sets being unordered.
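If a stable order is preferable, for example when comparing the output of two runs, one option is to sort before printing. The sketch below uses a stripped-down `Hotkey` with assumed attribute names and made-up data.

```python
class Hotkey:
    def __init__(self, keystroke="", action=""):
        self.keystroke = keystroke
        self.action = action


hotkeys = {
    Hotkey("Ctrl + Space", "Basic code completion"),
    Hotkey("Alt + F7", "Find usages"),
    Hotkey("Ctrl + F7", "Find usages in file"),
}

# sorted() returns a list, ordered here by the keystroke text.
ordered = sorted(hotkeys, key=lambda hotkey: hotkey.keystroke)
for hotkey in ordered:
    print(hotkey.keystroke, "-", hotkey.action)
```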

To learn more about sets in Python, check out the Python set documentation.


3 Responses to Python Basics: Using sets to compare keymaps

  1. Brian O says:

    >> Unfortunately, with a single keymap, we have less space than with separate keymaps so we will need to select a subset of hotkeys that we want to keep.
    <<
    If this means that some key combinations will no longer be available, this is a bad idea —
    please don't do it. No user will ever complain of having too many hotkey possibilities, and any user will hate losing a favorite hotkey.

    • Ernst Haagsman says:

      Don’t worry! This is just about printed keymaps that we hand out at conferences. We are not removing any of our hotkeys. Furthermore, you can add or change hotkeys in Settings | Keymap. Also, if you’re using the default Keymap, there are more hotkeys on the keymap reference that you can find in Help | Keymap Reference.
