Feedback Request: Limitations on Data Classes

While M13 is approaching, we are planning a little ahead. This is a request for feedback on some future changes in Kotlin.

We want to deliver Kotlin 1.0 sooner rather than later, and this makes us postpone some design choices we are not yet confident about. Today, let’s discuss data classes.


The concept of data classes has proven very useful when it comes to simply storing data. All you need is to mark a class declaration with the data modifier, and you get equals()/hashCode(), toString(), copy() and component functions for free.
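
As a minimal sketch of what those generated members look like in use (the User class and its values are hypothetical, chosen only for illustration):

```kotlin
// Hypothetical example class; any primary-constructor properties would do.
data class User(val name: String, val age: Int)

fun main() {
    val alice = User("Alice", 30)

    // equals()/hashCode() are generated from the constructor properties.
    println(alice == User("Alice", 30))                        // true
    println(alice.hashCode() == User("Alice", 30).hashCode())  // true

    // toString() lists the properties in declaration order.
    println(alice)                                             // User(name=Alice, age=30)

    // copy() creates a new instance, replacing only the named properties.
    val older = alice.copy(age = 31)
    println(older)                                             // User(name=Alice, age=31)

    // component1()/component2() enable multi-declarations.
    val (name, age) = older
    println("$name is $age")                                   // Alice is 31
}
```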

The most common use case works like a charm, but the interaction of data classes with other language features may lead to surprising results.


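For example, what if I want to extend a data class? What if the derived class is also a data class? Say, a data class Base declares properties a and b, and a data class Derived extends it, adding a property c.

Now, how does equals() or copy() work in Derived? All the well-known issues arise at once: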

  • should an instance of Base be equal to an instance of Derived if they have the same values for a and b?
  • what about transitivity of equals()?
  • what if I copy an instance of Derived through a reference of type Base?
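
The equality questions above can be reproduced with ordinary classes. The following is only a sketch: Base and Derived here are regular (non-data) classes with hand-written equals(), using Int properties as stand-ins, and they illustrate one naive strategy rather than anything the compiler actually generates.

```kotlin
// Ordinary classes with hand-written equals(), standing in for what a naive
// generated implementation might do. Names and types are illustrative only.
open class Base(val a: Int, val b: Int) {
    override fun equals(other: Any?): Boolean =
        other is Base && a == other.a && b == other.b

    override fun hashCode(): Int = 31 * a + b
}

class Derived(a: Int, b: Int, val c: Int) : Base(a, b) {
    override fun equals(other: Any?): Boolean =
        other is Derived && super.equals(other) && c == other.c

    // Note: this also breaks the hashCode() contract for base == derived.
    override fun hashCode(): Int = 31 * super.hashCode() + c
}

fun main() {
    val base = Base(1, 2)
    val derived = Derived(1, 2, 3)

    println(base == derived)   // true: Base only looks at a and b
    println(derived == base)   // false: Derived requires another Derived
}
```

With this strategy equals() is not even symmetric, and relaxing Derived.equals() to accept a plain Base would break transitivity instead.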

And what about component functions that enable multi-declarations? It seems more or less logical that c simply becomes the third component in Derived in the basic case, where the constructor of Derived takes a, b and c in that order.

But nothing prevents us from declaring the constructor parameters of Derived in another order, say, reversed: first b, then a. Now it’s not that clear any more. And it may get worse.

If c comes first, the inherited component1(): A is simply a conflict: it is not an override, but such an overload is not legal either.

And these are only some examples; there are many more issues, big and small.
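
To see why parameter order matters so much here, recall that multi-declarations are positional. A minimal sketch with a hypothetical, non-inherited data class Entry:

```kotlin
// Hypothetical class; component functions follow primary-constructor order.
data class Entry(val key: String, val value: Int)

fun main() {
    val entry = Entry("answer", 42)

    // A multi-declaration is positional: it calls component1(), component2(), ...
    val (first, second) = entry
    println(first)    // answer
    println(second)   // 42

    // The local names are irrelevant; only the position matters.
    val (value, key) = entry
    println(value)    // answer (this is component1(), i.e. the key property)
    println(key)      // 42
}
```

If a derived class could reorder the inherited parameters, the same position would mean a different property depending on the static type of the reference.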

Our strategy

On the one hand, we are not sure whether there is an elegant design for inheritance involving data classes. We have some sketches, but none of them looks promising enough.

On the other hand, we want to finalize the language design now, to be able to ship 1.0.

So, we decided to restrict data classes quite a bit to rule out all the problematic cases in 1.0, so that we can get back to them later and maybe lift some of the restrictions.

Proposed restrictions

We are going to do the following:

  • allow data classes to implement interfaces
  • forbid data classes from inheriting from other classes
  • forbid open data classes (i.e. other classes cannot extend data classes)
  • forbid inner data classes (not clear how equals()/hashCode() should treat the outer reference)
  • allow local data classes (the closure is not structured, so it’s OK for equals()/hashCode() to ignore it)
  • require val/var on all primary constructor parameters for data classes
  • require at least one primary constructor parameter for data classes
  • allow private primary constructor parameters for data classes
  • var’s are as good as val’s in all respects (they participate in equals()/hashCode() etc)
  • forbid varargs in primary constructor parameters for data classes

Again, some of the restrictions in this list may be lifted later, but for now we don’t want to deal with these cases.
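
As a rough sketch of what stays possible under the proposed rules (all names below are hypothetical, and the comments describe the proposal, not the behavior of any particular compiler version):

```kotlin
// All names here are hypothetical, for illustration only.
interface Named {
    val name: String
}

// Allowed by the proposal: implementing an interface; every primary
// constructor parameter is a val or var, and a parameter may be private.
data class Account(
    override val name: String,
    var balance: Int,
    private val secret: String
) : Named

// The proposal would reject, among others: an open data class, a data class
// extending another class, an inner data class, a data class with no
// parameters, and a vararg parameter in the primary constructor.

fun main() {
    // Allowed by the proposal: a local data class.
    data class Point(val x: Int, val y: Int)
    println(Point(1, 2) == Point(1, 2))          // true

    val a = Account("alice", 10, "s3cret")
    val b = Account("alice", 10, "s3cret")
    println(a == b)                              // true

    // vars participate in equals()/hashCode(), so mutation changes equality.
    b.balance = 20
    println(a == b)                              // false

    // The private property participates in equals() as well.
    println(a == Account("alice", 10, "other"))  // false
}
```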

Appendix. Comparing arrays

It’s a long-standing, well-known issue on the JVM: equals() works differently for arrays and collections. Collections are compared structurally, while arrays are not: equals() for them simply resorts to referential equality, this === other.

Currently, Kotlin data classes are ill-behaved with respect to this issue:

  • if you declare a component to be an array, it will be compared structurally,
  • but if it is a multidimensional array (array of arrays), the subarrays will be compared referentially (through equals() on arrays),
  • and if the declared type of a component is Any or T, but at runtime it happens to be an array, equals() will be called too.

This behavior is inconsistent, and we decided to fix it following the path of least resistance:

  • arrays are always compared using equals(), as all other objects

So, whenever you say

  • arr1 == arr2
  • arr in setOfArrays
  • DataClass(arr1) == DataClass(arr2)
  • or anything else along these lines,

you get the arrays compared through equals(), i.e. referentially.
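
A small sketch of the resulting behavior, and of opting into structural comparison by hand. The wrapper classes are hypothetical, and the sketch assumes a Kotlin version whose standard library provides contentEquals() and contentHashCode() for arrays:

```kotlin
// Hypothetical wrapper classes, for illustration only.
data class Samples(val values: IntArray)

// A regular class that opts into structural comparison by hand.
class StructuralSamples(val values: IntArray) {
    override fun equals(other: Any?): Boolean =
        other is StructuralSamples && values.contentEquals(other.values)

    override fun hashCode(): Int = values.contentHashCode()
}

fun main() {
    val arr1 = intArrayOf(1, 2, 3)
    val arr2 = intArrayOf(1, 2, 3)

    println(arr1 == arr2)                    // false: different instances
    println(arr1 == arr1)                    // true: same instance
    println(arr1 in setOf(arr2))             // false
    println(Samples(arr1) == Samples(arr2))  // false
    println(Samples(arr1) == Samples(arr1))  // true

    // Structural comparison has to be requested explicitly.
    println(arr1.contentEquals(arr2))                            // true
    println(StructuralSamples(arr1) == StructuralSamples(arr2))  // true
}
```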

We’d love to fix the inconsistency with collections, but the only sane way of fixing it seems to be fixing it in Java first, which is beyond anybody’s power, AFAIK :)

Call for feedback

Please share your opinion on the proposed changes. We are more or less sure about arrays, and pretty confident about limitations on data classes too, but it’s always a good idea to double-check with a wider range of use cases.

Thanks for your help!

About Andrey Breslav

Andrey is the lead language designer of Kotlin at JetBrains.
This entry was posted in Language design.

47 Responses to Feedback Request: Limitations on Data Classes

  1. LGTM!

    I’d also suggest you look at AutoValue’s README for their design principles. AutoValue brings immutable value classes to Java with the help of annotation processing.

  2. Christian says:

    require val/var on all primary constructor parameters for data classes

    Why? Normal parameters can be pretty useful. I’d love this to be kept allowed.

    • I’m not saying they are not useful. I’m saying that we are not ready to decide on the intricacies that they bring at the moment.

      • Paul says:

        Why can’t we forbid reordering parameters in data classes? It can be checked at compile time.

        • We can. It’s a tiny part of a possible design. There are too many such parts for us to be confident about arranging them the right way under the time pressure.

          • Paul says:

            Another semi-obvious decision is to just skip the parent constructor arguments in the child constructor when a data class inherits from another.

            The thing that frustrates me is that we lose the beautiful Hibernate inheritance things :(

  3. Rob Bygrave says:

    “Composition” of data classes is still available, right? So inheritance is not an option, but composition was probably my preferred option anyway. Yes, happy with those restrictions on data classes for my use cases.

    var’s are as good as val’s in all respects

    So with var’s I’d expect the hashCode() value can change. Mutable data classes are very handy so I’m happy there but that is going to catch the uninitiated with use in Set’s etc. No difference to Java here but good documentation on var’s / changing hashCode() values might be good.

    arrays are always compared using equals()

    My 2c says “fine by me”. For me the need/use of “structural equals for array” is extremely rare (for what it’s worth I don’t remember fussing on this in 17 years of Java coding) so for me this is fine.

  4. Mikael Gueck says:

    Is it a priority to be able to use Kotlin 1.0 easily with JPA and Spring Boot?

      • Mikael Gueck says:

        The continuing lack of Serializable in various Kotlin classes typically used in JPA @Id fields is a major roadblock in this kind of basic usage.

        • We are working on this. Unfortunately, it turned out to be a lot more work than we anticipated, but we’ll finish it by 1.0

          • Peter Niederwieser says:

            How about using Kotlin with JUnit? Not being able to declare public instance fields (correct me if I’m wrong, but I couldn’t find a way) means missing out on one of JUnit’s most important features (@Rule). To work around this, I had to introduce Java base classes for my tests. JUnit isn’t the only library with this requirement; without a way to declare public instance fields, the Java interop story isn’t complete.

        • Rob Bygrave says:

          various Kotlin classes typically used in JPA @Id fields

          I’m curious. What Kotlin classes are you using in JPA @Id fields?

        • Rob Bygrave says:

          So I had a quick look and it is not clear to me. The branches seem to be swapping between @Id of Long and Key, and both should work fine. Both of those types have nothing to do with Kotlin per se, so I’m missing your point/issue.

          Certainly there are no issues with Kotlin and Ebean ORM (which uses JPA mapping and entity bean enhancement that would be similar to Eclipselink).

          It seems that you are expecting to use Kotlin data classes as JPA entity beans which is interesting. As the author of Ebean ORM I’m not going to be recommending that approach to anyone as most commonly it is good practice to use inheritance with entity beans and have a @MappedSuperclass bean with common properties such as @Id, @Version, @WhoCreated, @WhoModified, @WhenCreated, @WhenModified etc.

          Data classes implement hashCode()/equals(), so that could conflict with JPA vendor enhancement/weaving (when using data classes as @EmbeddedId, for example).

          Sorry, probably no helpful comments there.
          Cheers, Rob.

  5. Adel says:

    so basically we should avoid Arrays in data classes or provide our own equals() every time?

    • I’d say you should avoid arrays everywhere (including your pure Java projects) unless you are doing some low-level optimizations.

      And yes, if you want structural equality for arrays anywhere (including pure Java code), you have to provide custom implementations for equals()/hashCode()

  6. M Platvoet says:

    For now these restrictions seem logical to me. Better to be restrictive now and expand later than the other way around.

    I believe equals (and hashCode) should work something along the lines of this example. So a data class should only equal the exact same type; everything else would be nonsense. But you could use propertiesEquals to test properties of unequal types.

  7. Peter Gromov says:

    Maybe I’m too big a fan of purity, but allowing mutable properties to participate in equals/hashCode by default sounds like a heresy to me. Such a nice opportunity to shoot oneself in the foot! OTOH silently excluding them from equals/hashCode would probably also be unexpected by the code authors :(

    • This is exactly the concern we had, but simply disallowing var’s would be overly restrictive in some use cases. This is a discouraged practice, as is having vars anywhere else, basically.

      • Gordon Tyler says:

        I too am disturbed by the prospect of mutable vars in a data class.

        I was wondering if it would be possible to declare a data class as mutable or immutable and then flag the use of mutable data classes in places where they would cause problems, for example anywhere that depends on a stable hashCode.

        • I think the best we can do for you is have an opt-in inspection that would warn you on the declaration site that var‘s in data classes require careful treatment

        • Rob Bygrave says:

          anywhere that depends on a stable hashCode.

          I think the issue here is that we only know this (places where we want stable hashCode) based on knowing how specific implementations (HashMap, HashSet etc) actually work – hence Andrey’s answer of an inspection warning.

  8. +1. I think every one of the data class restrictions is completely reasonable.

    I throw up my hands on arrays because that’s a pretty messed up situation to begin with.

  9. Rodrigo Quesada says:

    I mostly like the data annotation (I know it’s a modifier, but I’m gonna call it like that here) because it allows me to have equals/hashCode/toString methods automatically generated, I don’t care too much about the other functions, though.

    I think a good way of making everyone happy would be to allow for various configurations on the data annotation, kind of like Project Lombok, which allows you to configure certain annotations and fine-tune what you get in Java (such as the callSuper option of the EqualsAndHashCode annotation).

    The only issue with that would be that you could get pretty lengthy data class declarations when specifying multiple options, so it would be definitely great that besides that you also provide a way to create custom data annotations (that would be allowing to annotate your annotations with the data annotation or being able to alias it on a per-module basis), that way you get your own data annotations that behave exactly as you want/prefer. Something like this maybe:

    data(sameClassEquals=true, transitive=true) class Foo(val a: A, var b: B)

    Regarding the restriction of only being able to use val for data classes, I’m glad you will not be doing that, I definitely like freedom when modeling my classes (such as aggregates/entities when using DDD).

    Also, something else I would like to add is that I would certainly love being able to annotate non-primary-constructor properties and get them included in equals/hashCode/toString calculation for the data class. Something like what I commented on this ticket:

    class A() {
        data val a: Int
            get() = 123*321 //I don't know, do something cooler here instead
    }


    In this particular edge case (i.e. a class with data annotations only on its properties), though, you would not get component or copy methods generated, but you would get the rest for free (which I would certainly hate having to implement manually).

  10. Oliver Plohmann says:

    I think it’s a good idea you are aiming for delivering Kotlin 1.0 rather sooner than later. One more year and Scala gets more and more ground and people waiting for Kotlin might get tired of waiting. Once KT-3029 is implemented real Kotlin life will start :-).

    • I don’t think KT-3029 will be implemented per se. Most likely, we’ll just forbid protected in interfaces on the JVM

      • Oliver Plohmann says:

        Do you mean it won’t be implemented for Kotlin 1.0 or do you mean it won’t be implemented at all? Thanks.

        • Since Java does not allow protected in interfaces on the class file level, there’s no way to implement this properly on the JVM. So, unless we find a really clever trick, it won’t happen in any version of Kotlin/JVM.

          • Oliver Plohmann says:

            Traits may have protected vars and methods in Scala. I just tried it out to be sure. Think also Ceylon has that. So they did find some trick. I really hope Kotlin will have protected methods in traits. This is for modelling purposes very important to make sure encapsulation of the class extending the trait is not broken. Really hoping it can be done :-).

          • Alexander Udalov says:

            Protected vars and methods in Scala are compiled down to public methods, which is unfortunately not very “protected”.

          • Oliver Plohmann says:

            @Alexander: Thanks for sharing this. A trait might be in a different package than the class extending it. So methods/vars in a trait have to be public. Painful situation, really ;-).

  11. Those data class restrictions seem reasonable. Personally, I don’t think I’ll ever use vars on mine, but I don’t mind you leaving that possibility open.

  12. Karl Leopold says:

    For beginners, it may be useful to get a compiler warning if they use data classes with var properties as keys in maps or sets.

    I love these restrictions, because they make data classes very easy and intuitive to use. For more complex cases, there are still regular classes.

    Array handling sounds good.

  13. Michael Rush says:

    We’ve got about 60 data classes in our current codebase and all conform to the proposed restrictions. I guess this is because they were created according to the “spirit” of how data classes are intended to be used. The restrictions sound reasonable to me. Looking forward to 1.0!

  14. Jon Schneider says:

    Final-by-default is the most inconvenient feature of Kotlin to me at the moment. Trying to pair Spring with Kotlin, I often find myself opening classes and methods as I add Spring features to allow Spring to generate proxies as necessary. Final-without-recourse really scares me.

    The status quo feels like being engrossed in a great film only to see the sound man suddenly walk into the scene. It is jarring, and shatters the magic of the moment.

    Is it possible to make final-by-default a compile-time only constraint? After all, in the case of a Spring RestController class that I need to open for Spring to do its magic, I have no intention of ever inheriting from the type and would be perfectly happy to have the compiler enforce this so long as the final modifier didn’t make it into the bytecodes.

  15. Chris Kent says:

    I know I’m late to the party but I was on holiday and only just saw this post.

    In my day job I have to use primitive arrays quite often (unfortunately). If data classes compare arrays by reference it would make them a lot less useful for cases like mine. I would have to wrap them in a type that correctly implements hashCode() and equals() or manually write hashCode() and equals() methods for any data classes containing arrays. That would remove a lot of the benefits of using a data class.

    I think it would be better for data classes to use Arrays.deepEquals() for comparing arrays. Using reference equality will inevitably lead to subtle bugs. The caller would have to check the types of all the fields in the data class to know how equality works for that particular class.

    I understand that arrays aren’t very popular, and for good reason, but when you need them there is nothing else that will do. It would be a shame if data classes were broken WRT arrays when it shouldn’t be hard to support them.

    I don’t think it’s a problem that the equals() method of a data class containing an array won’t have the same behaviour as using == on two array references. If you’re using arrays you already need to be aware of the pitfalls and you have the option of using Arrays.deepEquals() to get the correct behaviour. But if reference equality checking is baked into data classes there is no workaround.

    • I understand that arrays aren’t very popular, and for good reason, but when you need them there is nothing else that will do. It would be a shame if data classes were broken WRT arrays when it shouldn’t be hard to support them.

      I can’t agree. If in a rare case you can’t use a data class and have to resort to manually implementing equals, it’s not very much a problem, IMO.

      • Chris Kent says:

        What are the downsides of supporting arrays properly in data classes?

        • Inconsistency and unpredictability. It’s better to learn that arrays are compared by identity everywhere, once and for all than debug weird behaviours with special-case behaviours. Scala has been there, and it’s very hard

  16. Zdenek says:

    Allowing data classes to inherit from interface with combination of non-val/var parameters – Does it allow you something more than using tag interfaces on data classes?

    The reason I’m asking is this simple case:

    interface Pet {
        val name: String
    }

    // I'm lost here as I cannot have non-val/var in constructor and cannot override
    data class Puppy(name: String) : Pet {
        override val name = name
    }

    Did I miss something, or can you simply not make “reasonable” use of interfaces?

Comments are closed.