SparseArray is a data container from Google. It is recommended to replace a HashMap whose key is int on Android. The reasons given in the official documentation are memory savings compared to HashMap: 1. Avoiding automatic boxing; 2. Data structures do not depend on external object mappings. Because for mobile, memory resources are more precious.
The source code parsing
Take android.util.SparseArray as an example, SDK=28
Member variables
private static final Object DELETED = new Object(); // flag whether value is deleted
private boolean mGarbage = false; // Marks whether garbage collection is currently taking place
private int[] mKeys; // An array of keys, in ascending order
private Object[] mValues; // Store an array of values
private int mSize; // The actual number of elements
Copy the code
Initialize the
The default SparseArray no-argument constructor has an initial capacity of 10, but is internally processed to 11. After initialization, both mKeys and mValues are unassigned arrays of length 11
public SparseArray(int initialCapacity) {
if (initialCapacity == 0) {
mKeys = EmptyArray.INT;
mValues = EmptyArray.OBJECT;
} else {
mValues = ArrayUtils.newUnpaddedObjectArray(initialCapacity);
mKeys = new int[mValues.length];
}
mSize = 0;
}
Copy the code
put
The most important method, the key and value stored procedure
public void put(int key, E value) {
// A binary search is performed for incoming keys, so mkeys must be in order
int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
// If yes, the value of the original key already exists
if (i >= 0) {
mValues[i] = value;
} else {
// If I is not found, perform a reverse operation on I to get the index to be inserted
i = ~i;
// If the subscript to be inserted happens to be DELETED, update it directly
if (i < mSize && mValues[i] == DELETED) {
mKeys[i] = key;
mValues[i] = value;
return;
}
if (mGarbage && mSize >= mKeys.length) {
// If the capacity is insufficient and garbage collection is needed, the collection operation is performed first
gc();
// After the space is compressed, the array element changes and the insertion position needs to be recalculated
i = ~ContainerHelpers.binarySearch(mKeys, mSize, key);
}
// Insert elementsmKeys = GrowingArrayUtils.insert(mKeys, mSize, i, key); mValues = GrowingArrayUtils.insert(mValues, mSize, i, value); mSize++; }}Copy the code
Binary lookup and insert elements together ensure the order of mKeys. After determining the corresponding position of key in the array, the position of value in the array is also determined. Mkeys. length=11,mSize=0. When mSize increases to >= mkeys. length, gc will be performed first if there are DELETED elements in the array. Improves space utilization and avoids array expansion.
binarySearch
static int binarySearch(int[] array, int size, int value) {
int lo = 0;
int hi = size - 1;
while (lo <= hi) {
final int mid = (lo + hi) >>> 1;
final int midVal = array[mid];
if (midVal < value) {
lo = mid + 1;
} else if (midVal > value) {
hi = mid - 1;
} else {
return mid; // value found}}return ~lo; // value not present
}
Copy the code
BinarySearch, unlike Arrays#binarySearch, returns the inverse of the bottom boundary if the key is not found. If I invert it, I return a negative number, put goes down, and when I invert it again, I get where I want to insert it, and it’s still pretty efficient. Such as:
MKeys ={3,4,6,7,8}, lo=0,hi=4, lo=2,hi=1, lo=2,hi=1Copy the code
delete
public void delete(int key) {
int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
if (i >= 0) {
if(mValues[i] ! = DELETED) { mValues[i] = DELETED; mGarbage =true; }}}Copy the code
The delete method is simple. If the key already exists and the corresponding value has not been marked for deletion, the value is marked for deletion and mGarbage is set to true. The method of marking is to point value to a static constant Object. Remove (int key) is also implemented by delete. If the value of the subscript is removed by the tag, it is updated directly, eliminating the need to delete and insert elements. It’s in the GC method that SparseArray actually handles mKeys and mValues
gc
private void gc(a) {
int n = mSize; // The number of elements before compression
int o = 0; // The number of compressed elements
int[] keys = mKeys;
Object[] values = mValues;
for (int i = 0; i < n; i++) {
Object val = values[i];
if(val ! = DELETED) {// If the value of the position is not marked for deletion
if(i ! = o) { keys[o] = keys[i];// Move the element at I forward to o so that all non-deleted elements are consecutively placed at the front of the array
values[o] = val;
values[i] = null; // Free up space
}
o++; // if I >o, the element is marked for deletion
}
}
mGarbage = false; // Compression (garbage collection) is complete
mSize = o; // Update the number of compressed elements
}
Copy the code
The DELETED nodes in the value array are reclaimed and the number of valid elements in the array is updated through a very clever algorithm. Looking back again put in operation, assuming that when mKeys =,3,4,5,6,7,8,9,10,11,12 {0}, mValues = {” a “, “b”, “c”, “d”, “e”, “f”, “g”, “h” and “I”, “j”, “k”}
SpA. Remove (4); //spA represents the SparseArray object mValues={"a"."b", *,"d"."e"."f"."g"."h"."i"."j"."k"} //* represents a DELETED marker. spA.put(15,"s") I = ~ ContainerHelpers. BinarySearch (mKeys, mSize, key) 11 and mGarbage mSize = = = 12 this timetrueHitting condition, call the gc method for the compression of array / / gc after mKeys =,3,5,6,7,8,9,10,11,12,12 {0} mValues = {"a"."b"."d"."e"."f"."g"."h"."i"."j"."k",null} mSize=10 i=~ContainerHelpers.binarySearch(mKeys, mSize, After the key) = 11 / / insert mKeys =,3,5,6,7,8,9,10,11,12,15 {0} mValues = {"a"."b"."d"."e"."f"."g"."h"."i"."j"."k"."s"}
mSize=11
Copy the code
Since the mKeys change after GC, it needs to be re-indexed, as shown above, I is equal to 12 before GC, and the final index after GC is 11. To note here that the gc process mKeys fill after a forward does not like mValues release space, such as the remove (4) and then remove (6), gc again after mKeys =,3,5,7,8,9,10,11,12,11,12 {0} will appear such circumstance, But that does not violate mKeys is orderly rules, because I = ~ ContainerHelpers. BinarySearch (mKeys, mSize, key); Lo =0,hi=size-1,hi is counted by the number of valid elements, like 11 and 12 are not valid key values, so binary search is still valid.
get
public E get(int key, E valueIfKeyNotFound) {
int i = ContainerHelpers.binarySearch(mKeys, mSize, key);
// Return the default value if the key is found in the mKeys array
if (i < 0 || mValues[i] == DELETED) {
return valueIfKeyNotFound;
} else {
return(E) mValues[i]; }}Copy the code
IndexOfValue, indexOfValueByValue
Retrieves the index of the key and value to determine if any element has been previously deleted by the tag. The difference is that key is a binary search, and value, because it is unordered, iterates through the array, and returns -1 if it is not found. So there’s a pit in there where indexOfValue internally uses == to compare addresses, and indexOfValueByValue compares two objects using equals, which is marked as private API. If you do want to determine the subscript by value, you can get it by traversing SparseArray and valueAt.
append
public void append(int key, E value) {
if(mSize ! =0 && key <= mKeys[mSize - 1]) {
put(key, value);
return;
}
if (mGarbage && mSize >= mKeys.length) {
gc();
}
mKeys = GrowingArrayUtils.append(mKeys, mSize, key);
mValues = GrowingArrayUtils.append(mValues, mSize, value);
mSize++;
}
Copy the code
If the array is empty and the key is larger than all the elements in the current mKeys, the search process is omitted. If there is no need to expand the array, append the value directly to the tail, and copy it once if necessary. This is the optimization made by Append for the special case of storage.
GrowingArrayUtils
Both insert and append are evaluated first to see if they need to be expanded, but inserts may need to be copied twice and end up calling System#arraycopy.
public static int growSize(int currentSize) {
return currentSize <= 4 ? 8 : currentSize * 2;
}
Copy the code
If the current size is greater than 4, it will be doubled.
Other containers for extension
SparseIntArray, SparseBooleanArray, and SparseLongArray are all data containers used to store specific values. There are no generics and no GC, and array types are all basic data types, avoiding boxing these value types and further optimizing. LongSparseArray is used to ensure that the key is of type long and can store a wider range of key data.
Performance analysis
Speed is
First use HashMap to do a 100000 forward insert:
public void testHashMap(View view) {
HashMap<Integer, String> map = new HashMap<>();
long start = System.currentTimeMillis();
for (int i = 0; i <= 100000; i++) {
map.put(i, String.valueOf(i));
}
long end = System.currentTimeMillis();
System.out.println("HashMap elapsed time:" + (end - start) );
}
Copy the code
Run 5 times to take the average and replace the HashMap with SparseArray. The result is as follows (time /ms)
The serial number | HashMap | SparseArray |
---|---|---|
1 | 75 | 66 |
2 | 72 | 73 |
3 | 64 | 80 |
4 | 46 | 63 |
5 | 74 | 68 |
On average, | 66.2 | 70 |
Let’s do another reverse insert
for (int i = 100000; i >= 0; i--) {
map.put(i, String.valueOf(i));
}
Copy the code
The result is no better. HashMap is about the same as forward insert, but SparseArray is more than 10 times slower. The reason for this is that SparseArray has to do a binary search every time, and in the worst case, the new key will be at the top of the array, so it will be very inefficient to copy the array every time.
Memory is
AndroidStudio Profiler generates 100,000 HashMap/SparseArray objects through a loop to see the difference in the Java Heap size before and after the generated objects
private HashMap<Integer, String> mMap = new HashMap<>();
private SparseArray<String> mSpa = new SparseArray<>();
public void test(a){
for (int i = 0; i < 100000; i++) {
mMap.put(i, String.valueOf(i));
//mSpa.put(i, String.valueOf(i));}}Copy the code
Adb shell Dumpsys meminfo
Applications Memory Usage (in Kilobytes):
Uptime: 854810329 Realtime: 2161500060
** MEMINFO in pid 16304 [com.example.dataset] **
Pss Private Private SwapPss Heap Heap Heap
Total Dirty Clean Dirty Size Alloc Free
------ ------ ------ ------ ------ ------ ------
Native Heap 5870 5836 0 215 22528 18214 4313
Dalvik Heap 0 0 0 0 4192 2096 2096
Stack 80 80 0 0
Ashmem 13 0 12 0
Gfx dev 392 392 0 0
Other dev 2 0 0 0
.so mmap 10058 416 5740 22
.apk mmap 289 0 0 0
.ttf mmap 54 0 0 0
.dex mmap 3656 12 1756 0
.oat mmap 387 0 28 0
.art mmap 7397 7000 20 102
Other mmap 19 4 0 0
EGL mtrack 20144 20144 0 0
GL mtrack 5896 5896 0 0
Unknown 6131 6088 0 54
TOTAL 60781 45868 7556 393 26720 20310 6409
App Summary
Pss(KB)
------
Java Heap: 7020
Native Heap: 5836
Code: 7952
Stack: 80
Graphics: 26432
Private Other: 6104
System: 7357
TOTAL: 60781 TOTAL SWAP PSS: 393
Copy the code
Memory usage after creating data using HashMap:
** MEMINFO inpid 16304 [com.example.dataset] ** Pss Private Private SwapPss Heap Heap Heap Total Dirty Clean Dirty Size Alloc Free ------ ------ ------ ------ ------ ------ ------ Native Heap 6414 6380 0 215 22528 18977 3550 Dalvik Heap 0 0 0 0 19243 9622 9621 Stack 88 88 0 0 Ashmem 13 0 12 0 Gfx dev 752 752 0 0 Other dev 2 0 0 0 .so mmap 10105 424 5740 22 .apk mmap 289 0 0 0 .ttf mmap 54 0 0 0 .dex mmap 3800 12 1828 0 .oat mmap 398 0 28 0 .art mmap 7592 7112 52 99 Other mmap 19 4 0 0 EGL mtrack 30216 30216 0 0 GL mtrack 6032 6032 0 0 Unknown 17420 17296 0 32 TOTAL 83562 68316 7660 368 41771 28599 13171 App Summary Pss(KB) ------ Java Heap: 7164 Native Heap: 6380 Code: 8032 Stack: 88 Graphics: 37000 Private Other: 17312 System: 7586 TOTAL: 83562 TOTAL SWAP PSS: 368Copy the code
Memory usage after creating data with SparseArray:
** MEMINFO in pid 16467 [com.example.dataset] **
Pss Private Private SwapPss Heap Heap Heap
Total Dirty Clean Dirty Size Alloc Free
------ ------ ------ ------ ------ ------ ------
Native Heap 6326 6288 0 214 22528 19060 3467
Dalvik Heap 0 0 0 0 10923 5462 5461
Stack 88 88 0 0
Ashmem 13 0 12 0
Gfx dev 756 756 0 0
Other dev 2 0 0 0
.so mmap 10108 424 5740 22
.apk mmap 299 0 20 0
.ttf mmap 54 0 0 0
.dex mmap 3737 12 1768 0
.oat mmap 398 0 28 0
.art mmap 7592 7112 52 99
Other mmap 19 4 0 0
EGL mtrack 30216 30216 0 0
GL mtrack 6044 6044 0 0
Unknown 10306 10180 0 32
TOTAL 76325 61124 7620 367 33451 24522 8928
App Summary
Pss(KB)
------
Java Heap: 7164
Native Heap: 6288
Code: 7992
Stack: 88
Graphics: 37016
Private Other: 10196
System: 7581
TOTAL: 76325 TOTAL SWAP PSS: 367
Copy the code
Using Dalvik heap-heap Alloc as a reference standard and using memory before and after multiple comparisons, SparseArray was calculated to reduce memory by 30-40% compared to HashMap.
conclusion
- SparseArray has less automatic boxing (int->Integer) than a HashMap, because a HashMap’s key stores the wrapper type
- Compared to HashMap, it is more memory efficient and simpler. Because HashMap uses arrays + linked lists or trees to store data, there is an additional Entry for storing hash, key, value, and the next Entry Node (Node), SparseArray maintains only two one-dimensional arrays internally for storing keys and values
- The core of SparseArray is binary search, and the time complexity is O(logN). If it is not found, it returns the left margin, and when it is inverted again, it is the index of the array that should be inserted.
- As the name suggests, the concept of a sparse array is used here to implement delete, not really delete, but to make a mark that reuses the element if it happens to be in that position when it is inserted again, otherwise it is compressed again if it runs out of space, optimising both efficiency and space.
- SparseArray is not designed for large data stores and is less efficient than HashMap when there are many data stores. Because it’s based on binary lookup, whereas HashMap hashes the subscripts directly without conflict, in order O(1) time.
- SparseArray can be used if memory requirements are high and the amount of data is less than a thousand, otherwise HashMap is more efficient