Kotlin has many great features: null safety, smart casts, string templates and more. In my experience, though, the feature developers like most is probably data classes, which are today's main topic. Data classes are so popular that you often see them used where they are not needed.

In this article we run an experiment to better understand how heavy use of data classes affects application size. The experiment strips the generated methods from every data class while keeping the project compiling, and then compares the results. It doesn't matter that the application itself is broken along the way, because we only care about the effect of data classes on the size of the application package.

Data class functionality

During development, we often need to create classes to store data. In Kotlin we can declare data classes and get some additional functionality as well.

  • component1(), component2() … componentN(), which enable destructuring declarations such as val (name, age) = person;
  • copy(), which creates a copy of the object with selected properties changed;
  • toString(), which concatenates the class name with the names and values of all its properties;
  • equals() and hashCode(), which are based on the property values.
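As a quick illustration of the list above, here is a minimal sketch of what the compiler generates. `Person`, `name` and `age` are hypothetical names chosen for this example.

```kotlin
// Hypothetical example class; the names are illustrative only.
data class Person(val name: String, val age: Int)

fun main() {
    val person = Person("Alice", 30)

    // component1() and component2() back the destructuring declaration.
    val (name, age) = person
    println("$name is $age")                  // Alice is 30

    // copy() creates a new instance with selected properties changed.
    println(person.copy(age = 31))            // Person(name=Alice, age=31)

    // equals() and hashCode() compare property values, not references.
    println(person == Person("Alice", 30))    // true
    println(person.hashCode() == Person("Alice", 30).hashCode())  // true
}
```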

We get a lot of extra functionality, but at what cost? Release builds usually go through an optimizer such as R8, ProGuard or DexGuard. These optimizers remove unused methods during the build, which means they also trim data classes.

Here’s a list of what will be removed:

  • component1(), component2() … componentN(), provided destructuring declarations are not used (and even if they are, more aggressive optimization settings can replace these calls with direct field access);
  • copy(), if it is not used.

Here are the methods that will not be deleted:

  • toString(), because the optimizer cannot know where this method might be called (for example, in logging); it is not obfuscated either;
  • equals() and hashCode(), because removing them could change the behaviour of the app.

So toString(), equals() and hashCode() usually remain in the release build.

Scale of the Changes

To measure the impact of data classes on an app, we start from a hypothesis: not all data classes are necessary, and some can be replaced with ordinary classes. Because release builds already strip componentN() and copy() from data classes, converting a data class to a regular class boils down to the following change:

data class SomeClass(val text: String) {
-   override fun toString() = ...
-   override fun hashCode() = ...
-   override fun equals(other: Any?) = ...
}

The generated methods cannot simply be deleted by hand. The only way to get the same effect in source code is to override them in every data class in the project, as follows:

data class SomeClass(val text: String) {
+   override fun toString() = super.toString()
+   override fun hashCode() = super.hashCode()
+   override fun equals(other: Any?) = super.equals(other)
}
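To see what these overrides change at runtime, here is a small sketch comparing a data class that keeps the generated methods with one that delegates to the kotlin.Any implementations. `Generated` and `Stripped` are hypothetical names used only for this example.

```kotlin
// Hypothetical classes for illustration only.
data class Generated(val text: String)

data class Stripped(val text: String) {
    // Delegate to the kotlin.Any implementations, as a regular class would.
    override fun toString(): String = super.toString()
    override fun hashCode(): Int = super.hashCode()
    override fun equals(other: Any?): Boolean = super.equals(other)
}

fun main() {
    println(Generated("a") == Generated("a"))   // true: structural equality
    println(Stripped("a") == Stripped("a"))     // false: identity equality
    println(Generated("a"))                     // Generated(text=a)
    println(Stripped("a"))                      // class name, "@" and a hash that varies per run
}
```

Note that copy() and the componentN() functions are still generated for `Stripped`; only the three overridden methods change.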

That would mean editing 7749 data classes in the project by hand.

The apps share one repository, so I don't know which of the 7749 data classes would have to be modified to measure the impact of data classes on a single app. So I decided to change all of them.

Compiler plugin

It was obviously impractical to modify so many classes manually, so I came up with the idea of a compiler plugin. It is a good way to do it, even though there was no documentation for it yet. An earlier article, "Fixing Serialization of Kotlin Objects Once and for All", also describes a compiler plugin, but that plugin generates methods, and what we need now is to delete them.

We found an open-source plugin on GitHub called Sekret that can hide properties in toString() with the help of annotations, so I used it as the basis for this plugin.

In terms of project structure, nothing is different from other compiler plugins. Here's what we need:

  • a Gradle plugin, which makes integration easy;
  • a compiler plugin, which is connected through the Gradle plugin;
  • a sample project in which various tests can be run.

The most important part of the Gradle plugin is the KotlinGradleSubplugin declaration. This subplugin is discovered via ServiceLoader. The main Gradle plugin lets us configure the KotlinGradleSubplugin, which in turn configures the behaviour of the compiler plugin.

import com.google.auto.service.AutoService
import org.gradle.api.Project
import org.gradle.api.tasks.compile.AbstractCompile
import org.jetbrains.kotlin.gradle.dsl.KotlinCommonOptions
import org.jetbrains.kotlin.gradle.plugin.KotlinCompilation
import org.jetbrains.kotlin.gradle.plugin.KotlinGradleSubplugin
import org.jetbrains.kotlin.gradle.plugin.SubpluginArtifact
import org.jetbrains.kotlin.gradle.plugin.SubpluginOption

@AutoService(KotlinGradleSubplugin::class)
class DataClassNoStringGradleSubplugin : KotlinGradleSubplugin<AbstractCompile> {

    // Check if the main Gradle plugin is applied
    override fun isApplicable(project: Project, task: AbstractCompile): Boolean =
        project.plugins.hasPlugin(DataClassNoStringPlugin::class.java)

    override fun apply(
        project: Project,
        kotlinCompile: AbstractCompile,
        javaCompile: AbstractCompile?,
        variantData: Any?,
        androidProjectHandler: Any?,
        kotlinCompilation: KotlinCompilation<KotlinCommonOptions>?
    ): List<SubpluginOption> {
        // Compiler plugin options are defined with help of DataClassNoStringExtension
        val extension =
            project
                .extensions
                .findByType(DataClassNoStringExtension::class.java)
                ?: DataClassNoStringExtension()
        val enabled = SubpluginOption("enabled", extension.enabled.toString())

        return listOf(enabled)
    }

    override fun getCompilerPluginId(): String = "data-class-no-string"

    // This is the compiler plugin artefact and should be available in any Maven repository specified in the target project
    override fun getPluginArtifact(): SubpluginArtifact =
        SubpluginArtifact("com.cherryperry.nostrings", "kotlin-plugin", "1.0.0")
}

A compiler plugin has two important components: ComponentRegistrar and CommandLineProcessor. The former integrates our logic into the compilation phases; the latter processes the plugin's parameters. I won't describe them in detail here; you can check the implementation in the DataClassNoString repository. What I do want to point out is that the approach differs from the one in the other article: we register a ClassBuilderInterceptorExtension rather than an ExpressionCodegenExtension.

ClassBuilderInterceptorExtension.registerExtension(
    project = project,
    extension = DataClassNoStringClassGenerationInterceptor()
)

Next, we need to prevent the compiler from creating certain methods. To do this we use DelegatingClassBuilder, which delegates all calls to the original ClassBuilder and lets us override the behaviour of newMethod. When the compiler tries to create toString(), equals() or hashCode(), we return an empty MethodVisitor. The compiler still writes the code for these methods into that visitor, but the code never reaches the generated class.

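import org.jetbrains.kotlin.codegen.ClassBuilder
import org.jetbrains.kotlin.codegen.DelegatingClassBuilder
import org.jetbrains.kotlin.resolve.jvm.diagnostics.JvmDeclarationOrigin
import org.jetbrains.org.objectweb.asm.MethodVisitor
import org.jetbrains.org.objectweb.asm.Opcodes

class DataClassNoStringClassBuilder(
    val classBuilder: ClassBuilder
) : DelegatingClassBuilder() {

    override fun getDelegate(): ClassBuilder = classBuilder

    override fun newMethod(
        origin: JvmDeclarationOrigin,
        access: Int,
        name: String,
        desc: String,
        signature: String?,
        exceptions: Array<out String>?
    ): MethodVisitor {
        return when (name) {
            // Swallow the bodies of these methods instead of writing them to the class
            "toString", "hashCode", "equals" -> EmptyVisitor
            else -> super.newMethod(origin, access, name, desc, signature, exceptions)
        }
    }

    // A visitor with no delegate: everything written to it is discarded
    private object EmptyVisitor : MethodVisitor(Opcodes.ASM5)

}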
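The delegation idea can be tried out without any compiler dependencies. The sketch below is a toy model: `MethodSink`, `SimpleClassBuilder`, `RecordingBuilder`, `DiscardingSink` and `FilteringBuilder` are hypothetical stand-ins invented for this example and are not part of the Kotlin compiler API.

```kotlin
// Toy stand-ins for the compiler's MethodVisitor and ClassBuilder.
// All names here are hypothetical and unrelated to the real compiler API.
interface MethodSink {
    fun emit(instruction: String)
}

interface SimpleClassBuilder {
    fun newMethod(name: String): MethodSink
}

// Records every instruction of every method it is asked to create.
class RecordingBuilder : SimpleClassBuilder {
    val methods = mutableMapOf<String, MutableList<String>>()

    override fun newMethod(name: String): MethodSink {
        val body = methods.getOrPut(name) { mutableListOf() }
        return object : MethodSink {
            override fun emit(instruction: String) {
                body.add(instruction)
            }
        }
    }
}

// Accepts instructions and drops them, playing the role of the empty visitor.
object DiscardingSink : MethodSink {
    override fun emit(instruction: String) = Unit
}

// Forwards everything to the delegate except the three filtered names.
class FilteringBuilder(private val delegate: SimpleClassBuilder) : SimpleClassBuilder {
    override fun newMethod(name: String): MethodSink = when (name) {
        "toString", "hashCode", "equals" -> DiscardingSink
        else -> delegate.newMethod(name)
    }
}

fun main() {
    val recorder = RecordingBuilder()
    val builder = FilteringBuilder(recorder)
    for (name in listOf("getText", "toString", "copy", "equals", "hashCode")) {
        builder.newMethod(name).emit("return")
    }
    println(recorder.methods.keys)   // [getText, copy]
}
```

The caller never notices the difference: it receives a sink for every method and writes to it, but only the methods that pass the filter end up in the recorded class.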

In this way we intervene in class generation and exclude the methods mentioned above completely. The tests in the sample project check that these methods are gone. You can also verify that none of them exist by examining the bytecode of the JAR or dex file.

class AppTest {

    data class Sample(val text: String)

    @Test
    fun `toString method should return default string`() {
        val sample = Sample("test")
        assertEquals(
            "${sample.javaClass.name}@${Integer.toHexString(System.identityHashCode(sample))}",
            sample.toString()
        )
    }

    @Test
    fun `hashCode method should return identityHashCode`() {
        val sample = Sample("test")
        assertEquals(System.identityHashCode(sample), sample.hashCode())
    }

    @Test
    fun `equals method should return true only for itself`() {
        val sample = Sample("test")
        assertEquals(sample, sample)
        assertNotEquals(Sample("test"), sample)
    }

}

All the code can be found in the DataClassNoString sample repository, and you can also see how the plug-in is integrated.
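Besides inspecting bytecode by hand, the presence of the three methods can be checked with plain Java reflection on the JVM. The sketch below is an assumption about how one might do it, not code from the repository; `WithGenerated`, `Plain` and `declaredMethodNames` are hypothetical names. It is compiled without the plugin, so the data class still has its generated methods, while the regular class shows what a stripped class is expected to look like.

```kotlin
// Hypothetical classes for illustration only.
data class WithGenerated(val text: String)
class Plain(val text: String)

// Names of the methods declared directly in the class file (inherited ones are excluded).
fun declaredMethodNames(type: Class<*>): Set<String> =
    type.declaredMethods.map { it.name }.toSet()

fun main() {
    val filtered = setOf("toString", "hashCode", "equals")

    println(declaredMethodNames(WithGenerated::class.java).intersect(filtered).sorted())
    // [equals, hashCode, toString]

    println(declaredMethodNames(Plain::class.java).intersect(filtered))
    // []
}
```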

Results

For comparison we use release builds of Bumble and Badoo. The results below were obtained with Diffuse, a tool that reports detailed differences between two APK files, such as the size of the dex and resource files and the number of strings, methods and classes.

Application           Bumble     Bumble (after)   Diff          Badoo      Badoo (after)   Diff
Data classes          4026       -                -             2894       -               -
DEX size (zipped)     12.4 MiB   11.9 MiB         -510.1 KiB    15.3 MiB   14.9 MiB        -454.1 KiB
DEX size (unzipped)   31.7 MiB   30 MiB           -1.6 MiB      38.9 MiB   37.6 MiB        -1.4 MiB
Strings in DEX        188969     179197           -9772         244116     232114          -12002
Methods               292465     277475           -14990        354218     341779          -12439

The number of data classes was determined by analyzing the strings removed from the dex file.

Every generated toString() implementation of a data class starts with the short name of the class, an opening parenthesis and the name of the first property. A data class without any properties cannot exist.
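The counting heuristic described above can be sketched roughly as follows. The regular expression and the function name `countDataClassPrefixes` are assumptions made for this example; the author's actual analysis script is not shown in the article.

```kotlin
data class Sample(val text: String)

// Assumed pattern: an identifier, "(", a property name and "=", for example "Sample(text=".
// This is a guess at the heuristic, not the author's actual script.
val dataClassPrefix = Regex("""[A-Za-z_]\w*\(\w+=""")

// Counts the strings that look like the first constant of a generated toString().
fun countDataClassPrefixes(strings: List<String>): Int =
    strings.count { dataClassPrefix.matches(it) }

fun main() {
    println(Sample("test"))   // Sample(text=test)

    // Hypothetical strings that disappeared from a dex file.
    val removed = listOf("Sample(text=", ", age=", "Person(name=", "Hello (world)")
    println(countDataClassPrefixes(removed))   // 2
}
```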

From the results we can conclude that, on average, each data class costs about 120 bytes compressed and 400 bytes uncompressed. At first glance that doesn't seem like much, so I looked at the effect across the entire application: data classes account for about 4% of the dex size in the whole project.
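The share of the dex size can be rechecked from the unzipped sizes in the table. This is only a back-of-the-envelope sketch; `sharePercent` is a hypothetical helper, and the inputs are the rounded MiB values from the table, so the result is approximate.

```kotlin
import java.util.Locale

// Share of the original size that a reduction represents, in percent.
fun sharePercent(reduction: Double, original: Double): Double =
    reduction / original * 100

fun main() {
    // Unzipped DEX sizes in MiB, copied from the table above.
    val bumble = sharePercent(reduction = 1.6, original = 31.7)
    val badoo = sharePercent(reduction = 1.4, original = 38.9)
    val combined = sharePercent(reduction = 1.6 + 1.4, original = 31.7 + 38.9)

    println("Bumble: %.1f%%, Badoo: %.1f%%, combined: %.1f%%".format(Locale.ROOT, bumble, badoo, combined))
    // Bumble: 5.0%, Badoo: 3.6%, combined: 4.2%
}
```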

It is also worth noting that since we are using an MVI architecture, we tend to use data classes more than other architectures, so the impact on your own application may not be as significant.

Using data classes

I'm not saying you should stop using data classes, but all of the above is worth weighing before you reach for one. Here are a few questions to ask first:

  • Do you need equals() and hashCode()? If so, consider a data class, but don't forget that toString() comes with it and is not obfuscated.
  • Do you need destructuring declarations? If that is the only reason, a data class is not the best choice.
  • Do you need toString()? Business logic rarely depends on the implementation of this method, and when it is needed it can be generated by the IDE.
  • Do you need a simple DTO to pass to another layer, or a place to store some configuration? If the answer to the previous questions is no, a regular class will do for these scenarios.
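To make the last two points concrete, here is a minimal sketch of regular classes used instead of data classes. `UserId` and `Config` are hypothetical names; whether hand-written equality is worth it depends on your own code.

```kotlin
// Hypothetical value holder written as a regular class: no toString(), copy() or
// componentN() are generated, and equality is spelled out by hand.
class UserId(val value: String) {
    override fun equals(other: Any?): Boolean =
        other is UserId && other.value == value

    override fun hashCode(): Int = value.hashCode()
}

// Hypothetical configuration holder with no overrides at all: identity equality only.
class Config(val endpoint: String, val retries: Int)

fun main() {
    println(UserId("42") == UserId("42"))           // true
    println(setOf(UserId("42"), UserId("42")).size) // 1

    println(Config("https://example.com", 3) == Config("https://example.com", 3)) // false
}
```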

We could not get rid of data classes in the project entirely, and the plugin above breaks the application. The methods were removed only to measure the impact of a large number of data classes. In our case, data classes take up about 4% of the dex size.

If you want to measure how much space your data classes are taking up in your project, you can do it yourself using our plugin. If you’ve done the same experiment, feel free to share your feedback!
