Straight to the root cause of the NPE (my own rookie mistake):

The Rx subscription was never cancelled. By the time the callback fired, the Fragment had already been destroyed, and calling an update method on the view reference naturally threw an NPE.

Is that it? Yes, it is that basic an error, and I spent a whole day tracking it down. Cancelling an Rx subscription is common sense:

  • Either introduce lifecycle management;
  • Or define a CompositeSubscription (CompositeDisposable in RxJava2) and call clear() when the Activity or Fragment is destroyed;
  • Or unsubscribe each subscription individually: RxJava1 uses unsubscribe(), RxJava2 uses dispose().
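The container option can be sketched in plain Java. This is only a simplified model under stated assumptions, not RxJava code: PendingRequest, RequestBag and ListScreen are hypothetical names standing in for a subscription handle, a CompositeSubscription-style container and a Fragment with a refresh view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-in for a subscription handle (not the RxJava type). */
final class PendingRequest {
    private Consumer<String> callback;

    PendingRequest(Consumer<String> callback) {
        this.callback = callback;
    }

    /** Drops the callback so that a late response is silently ignored. */
    void cancel() {
        callback = null;
    }

    /** Simulates the network layer delivering a response. */
    void deliver(String response) {
        Consumer<String> cb = callback;
        if (cb != null) {
            cb.accept(response);
        }
    }
}

/** Hypothetical container playing the role of a CompositeSubscription. */
final class RequestBag {
    private final List<PendingRequest> requests = new ArrayList<>();

    void add(PendingRequest request) {
        requests.add(request);
    }

    /** Cancels every tracked request and forgets it. */
    void clear() {
        for (PendingRequest request : requests) {
            request.cancel();
        }
        requests.clear();
    }
}

/** Hypothetical screen standing in for a Fragment that owns a refresh view. */
final class ListScreen {
    private final RequestBag bag = new RequestBag();
    private StringBuilder refreshView = new StringBuilder(); // the "view" reference
    int finishRefreshCalls = 0;

    PendingRequest loadList() {
        PendingRequest request = new PendingRequest(response -> {
            // Would throw a NullPointerException if it ran after onDestroyView().
            refreshView.append(response);
            finishRefreshCalls++;
        });
        bag.add(request);
        return request;
    }

    void onDestroyView() {
        bag.clear();        // cancel first, so no callback can touch the view
        refreshView = null; // from here on the view is gone
    }
}
```

In this model, a response delivered after onDestroyView() never reaches the callback. Removing the bag.clear() line reproduces the kind of crash described in this article.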

I took this for common development knowledge, but the old project slapped me in the face. Below I go through the cause and effect of the whole thing, in the hope that you can learn from it and take fewer detours when troubleshooting bugs. The cause of the error is already given above, so if the troubleshooting process doesn't interest you, feel free to skip the rest ~


0x1. Another BUG solved by accident

Yesterday morning, a little after 11:30, I opened a video app, ready to watch a show over lunch, when my team lead pinged me on DingTalk:

I opened it and saw that the version that had just gone live had already hit this error more than 2,000 times:

Opening the detailed log:

An NPE (null pointer exception): the call to the refresh component's close-refresh method hit a null reference. I checked the rest of the log and ran it through my own de-obfuscation script to see whether I could get more useful information.

It didn't help. Oddly enough, the error count kept climbing, yet no user had reported anything.

Three rounds of integration testing hadn't caught it, and we couldn't reproduce it in our own tests.

After more log checks, one log stood out to me:

UnknownHostException: Unable to resolve host "xxx.xxx.com": No address associated with hostname

Then the NPE is reported, pointing to the finishRefresh() in the exception handler:

Hm? Is something wrong with the error-handling code? The testers are short-staffed, missed the network-exception boundary tests, and probably don't know how to simulate network errors and weak-network conditions.

I used Charles to simulate it: capture the traffic, locate the list endpoint, set a breakpoint, and Abort the request directly:

Sure enough, the app crashed, and I was delighted to have pinned down the problem so quickly. Then I looked at the error log:

notifyDataSetChanged() was not called after setEmptyView().

Well, I didn't solve the BUG I was after, but solved a different one. A blessing in disguise?


0x2. Is it really KAE's fault?

The control is null, so I suspected Kotlin-Android-Extensions (KAE): I had seen cases before where the View instance looked up by ID came back null. So before lunch, I asked my friends in the group chat:

Everyone enthusiastically urged me to stop using KAE: it has many pitfalls, ViewBinding is the officially recommended option, handwritten findViewById is rock solid, and so on. I know all that…

But you can't just swap things out on a whim. Replacing every usage in the project with another approach wouldn't really solve the problem; at the very least I wanted to understand its cause…

How does KAE free you from findViewById? Write a test project with a TestActivity that references a control, then click Tools → Kotlin → Show Kotlin Bytecode → Decompile:

In the end it still calls findViewById() to find the control.

In a Fragment it is implemented the same way as in an Activity, with only two differences:

  • ① It calls the Fragment's getView() method to get the layout (the View returned by onCreateView);
  • ② It overrides the onDestroyView() method to clear the instances in the Map.

Why, you might wonder, is the Map cleared in onDestroyView() rather than in onDestroy()?

A: Because of how Fragments are used. In detail:

After a replace(), the replaced Fragment may only go through onDestroyView() rather than being fully destroyed via onDestroy() (this is what happens when the transaction is added to the back stack). The aim is to destroy the View while keeping the View state and the Fragment's member fields, so that the next load can go straight to onCreateView() and the Fragment can be reused for faster loading.

As for the View state saving mechanism, I don't understand it particularly well. I skimmed the TextView source (onSaveInstanceState and onRestoreInstanceState), saw a Parcelable implementation overriding a few methods, and guess that it is serialization. After serialization and deserialization, however, the object instances are no longer the same. If the Map still held the old key-value pairs (ID → instance), the View instance looked up by ID would be wrong, which is why the Map is cleared here.
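The caching behaviour described above can be modelled in a few lines of plain Java. This is a sketch of the idea only, not the code KAE actually generates: ViewCache, its method names, and the IntFunction standing in for getView().findViewById(id) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical model of an ID-to-view cache; not KAE's generated code. */
final class ViewCache<V> {
    private final Map<Integer, V> cache = new HashMap<>();
    private IntFunction<V> finder; // stands in for getView().findViewById(id)

    /** Called whenever a fresh view hierarchy has been created. */
    void onViewCreated(IntFunction<V> finder) {
        this.finder = finder;
    }

    /** Looks the view up once, then serves it from the map. */
    V findCachedViewById(int id) {
        V view = cache.get(id);
        if (view == null) {
            view = finder.apply(id);
            cache.put(id, view);
        }
        return view;
    }

    /** Clears the map so a recreated view hierarchy is never served stale entries. */
    void onDestroyView() {
        cache.clear();
    }
}
```

If onDestroyView() did not clear the map, a lookup after the view hierarchy is recreated would keep returning the instance from the old hierarchy.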

So there is nothing wrong here, and KAE is not to blame. Although it isn't involved in this bug, let's also go through how findViewById is handled in an Adapter ~

KAE cannot be used directly in an Adapter; you need to add the following experimental configuration to build.gradle:

// Enable LayoutContainer
androidExtensions {
    experimental = true
}

There are actually two ways to call it. The first is this:

The bytecode decompiled to Java:

It calls findViewById directly each time. The other way is to have the ViewHolder implement LayoutContainer:

The bytecode decompiled to Java:

The principle is the same as before: a HashMap is created to hold the references, bound to the View passed in by the ViewHolder. If you really want to use KAE in an Adapter, use the second way.

Kotlin Android Extensions are deprecated, ViewBinding is officially recommended

KAE trades space for time by using an extra HashMap to store the View instances. More importantly, this part is a black box to most users, who sometimes stumble into strange pitfalls.

When you enable ViewBinding, Android Studio automatically generates a Binding class for each layout file (with null-safe view references), under /build/generated/data_binding_base_class_source_out.

For details, see: Kotlin-Android-Extensions deprecates? Get me up


0x3. A moment of inspiration

Having established that KAE was not to blame, what was making the control null? The investigation suddenly hit a dead end. I could only start from user behavior, but for some reason the users' behavior logs could not be viewed.

Fortunately, we have full event tracking. I opened Kibana, filtered the logs by error type, found the error logs, got the device ID, and then queried that user's behavior.

By analyzing the errors reported by multiple users, I found a pattern:

The home page shows Loading and then crashes, and it has usually been a long time since the user last opened the app.

Isn't this just the app being reclaimed and the Activity being rebuilt on reopening? It happens because the app plays a sneaky trick that many apps play: although it prompts “press again to exit the program”, it actually calls moveTaskToBack() to move to the background.

I ran the app from Android Studio, went to the problem page, sent the app to the background, killed the process directly from Logcat, reopened the app and waited a moment. Sure enough, it crashed. The log:

Indeed reproduced: the Activity rebuild leaves the refresh control null. I rattled off the cause of the crash to my team lead, and the emergency fix was to add a null check before the call, so that at least it doesn't crash.
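The emergency null check amounts to something like the following plain-Java sketch. Only the finishRefresh() name comes from the crash log; the Refreshable interface and the RefreshGuard helper are hypothetical stand-ins, since the real refresh component is not shown here.

```java
/** Hypothetical stand-in for the refresh component; only finishRefresh() is from the log. */
interface Refreshable {
    void finishRefresh();
}

final class RefreshGuard {
    private RefreshGuard() {
    }

    /** Calls finishRefresh() only if the view still exists; returns whether it ran. */
    static boolean finishRefreshSafely(Refreshable refreshView) {
        if (refreshView == null) {
            return false; // view already destroyed: skip instead of crashing
        }
        refreshView.finishRefresh();
        return true;
    }
}
```

A guard like this only hides the symptom. The callback still runs against a destroyed screen, which is why the investigation continued.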

It was almost time to leave work (6 o'clock). Normally I'd grab a bite, slack off a little and so on, but not understanding the exact cause of this problem would have nagged at me at home, so I simply stayed late to investigate.


0x4. Working overtime on the investigation

Activity rebuilds are probably tied to the Fragment lifecycle, multi-layer nesting and so on, so I logged the lifecycle callbacks in both BaseActivity and BaseFragment.

The page nests three layers: CustomerFragment → ThirdAgentListFragment → CustomerChildNewFragment

Then I simulated the crash and analyzed the log output:

When the Activity is rebuilt, the Fragment is restored, but it is destroyed again almost immediately.

onCreateView() → onViewCreated() → onActivityCreated() → various initialization operations

After that it goes straight to onDestroyView() and onDestroy(). The cause is actually replace(), so let's go back to the code:

The code calls replace() (a FragmentTransaction method, obtained through the FragmentManager). Normally the two Fragments go through the following lifecycle callbacks (without addToBackStack):

  • Replaced Fragment: onPause() → onStop() → onDestroyView() → onDestroy() → onDetach()
  • Replacing Fragment: onAttach() → onCreate() → onCreateView() → onViewCreated() → onActivityCreated() → onStart() → onResume()

So, the actual logic here goes like this:

Restore the old Fragment → create a new Fragment → replace the restored Fragment with the new one → the restored Fragment is destroyed

The restored Fragment sends its request during initialization, but it is replaced and its View destroyed before the response returns. When the callback finally fires and touches the View instance, it throws the null pointer exception.

One fix is to check whether the savedInstanceState (Bundle) parameter is null; if it is not null, skip the loading:

Of course, the best fix starts from the network request: when an Activity or Fragment is destroyed, cancel all of its Rx subscriptions.

The project has been running for four or five years, yet this BUG had never been discovered.

Single-Activity, multi-Fragment style; no scenes that replace() Fragments frequently; and most requests are never cancelled.

A whole day of checking, and it turns out to be such a simple BUG. The predecessors dug the pit and the successors fill it in; enough to make you cough up blood…

But I also gained a lot from the troubleshooting process:

  • Learned how KAE wraps findViewById, so I can use it with confidence from now on;
  • Got to know ViewBinding;
  • Verified the Fragment lifecycle (which I used to just memorize by rote);
  • Learned about the Activity rebuild mechanism.

That's all. The road to fixing bugs is long and full of obstacles, but keep walking and you will get there. I hope this article helps with your day-to-day debugging and error hunting. Thanks ~