In essence, shared memory enables cross-process communication, reduces the number of data copies, and improves performance.
But doesn't Android already have Binder for cross-process communication? Why is anonymous shared memory needed at all? Let's first look at a snippet from Binder's initialization.
#define BINDER_VM_SIZE ((1 * 1024 * 1024) - sysconf(_SC_PAGE_SIZE) * 2)

ProcessState::ProcessState(const char *driver)
    : mDriverName(String8(driver))
    , mDriverFD(open_driver(driver))
    , mVMStart(MAP_FAILED)
    , mThreadCountLock(PTHREAD_MUTEX_INITIALIZER)
    , mThreadCountDecrement(PTHREAD_COND_INITIALIZER)
    , mExecutingThreadsCount(0)
    , mMaxThreads(DEFAULT_MAX_BINDER_THREADS)
    , mStarvationStartTimeMs(0)
    , mManagesContexts(false)
    , mBinderContextCheckFunc(NULL)
    , mBinderContextUserData(NULL)
    , mThreadPoolStarted(false)
    , mThreadPoolSeq(1)
{
    if (mDriverFD >= 0) {
        // mmap the binder, providing a chunk of virtual address space to receive transactions.
        mVMStart = mmap(0, BINDER_VM_SIZE, PROT_READ, MAP_PRIVATE | MAP_NORESERVE, mDriverFD, 0);
        ...
    }
}
As you can see, when Binder is initialized, the application maps a buffer of 1 MB minus 2 pages (1 page = 4 KB), i.e. 1016 KB. That is fine for transferring commands, but not enough for transferring image data.
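To make the limit concrete, here is a sketch of what happens when a payload exceeds the transaction buffer. The service binding is assumed to already exist, and the names here are hypothetical:

import android.os.Bundle;
import android.os.Message;
import android.os.Messenger;
import android.os.RemoteException;

final class BinderLimitDemo {
    // "messenger" is assumed to be bound to some remote service already.
    static void sendHugePayload(Messenger messenger) {
        Bundle data = new Bundle();
        data.putByteArray("bitmap", new byte[2 * 1024 * 1024]); // ~2 MB payload
        Message msg = Message.obtain();
        msg.setData(data);
        try {
            messenger.send(msg); // overflows the ~1 MB transaction buffer
        } catch (RemoteException e) {
            // Arrives as android.os.TransactionTooLargeException,
            // a subclass of RemoteException.
        }
    }
}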
In addition, Binder applies its own scheduling to each Binder kernel buffer, which cannot deliver data to the SurfaceFlinger (SF) process fast enough. For this reason, Android passes such data through shared memory, namely Ashmem anonymous shared memory.
Main text
In fact, Ashmem is not only used in the kernel; at the Java layer, Android also provides a class named MemoryFile that makes anonymous shared memory easy to use. This time we take MemoryFile as the entry point to talk about how Ashmem anonymous shared memory is used.
As always, let's first take a look at how MemoryFile is used.
MemoryFile memoryFile = null;
try {
    // Build a shared memory region
    memoryFile = new MemoryFile("test", 1024 * 5);
    OutputStream o = memoryFile.getOutputStream();
    byte[] bs = new byte[1024];
    bs[0] = 1;
    // write
    o.write(bs, 0, 1);
    o.flush();
    // read
    InputStream in = memoryFile.getInputStream();
    int r = in.read(bs, 0, 1);
    Log.e("r", "r:" + bs[0]);
} catch (Exception e) {
    e.printStackTrace();
} finally {
    if (memoryFile != null) {
        memoryFile.close();
    }
}
You can see that the operations are exactly the same as ordinary File operations, as if there were no difference at all. A File itself can also act as a relay station for passing information between processes. So what advantage does MemoryFile have over a regular file? Next, let's dig into the source code to see how anonymous shared memory compares with File, and how it relates to the Binder driver.
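Before diving into the source, it is worth seeing where the advantage actually shows up: across process boundaries. The sketch below (API 27+; the surrounding AIDL/service plumbing is assumed and hypothetical) relies on the fact that SharedMemory is Parcelable, so only a file descriptor crosses Binder while the bulk data stays in the shared pages:

import android.os.SharedMemory;
import android.system.ErrnoException;
import java.nio.ByteBuffer;

public final class FrameChannel {
    // Producer process: the returned SharedMemory can be sent through any
    // Binder call, e.g. as the return value of an AIDL method.
    public static SharedMemory produce() throws ErrnoException {
        SharedMemory shm = SharedMemory.create("frame", 1024 * 1024);
        ByteBuffer buf = shm.mapReadWrite();
        buf.put(0, (byte) 42);       // fill data; visible to every mapper
        SharedMemory.unmap(buf);     // drop our view; the region itself lives on
        return shm;                  // Binder marshals the fd, not the megabyte
    }

    // Consumer process, after the SharedMemory arrives over Binder.
    public static byte consume(SharedMemory shm) throws ErrnoException {
        ByteBuffer ro = shm.mapReadOnly();
        byte first = ro.get(0);      // reads the producer's bytes, zero copies
        SharedMemory.unmap(ro);
        return first;
    }
}

With a regular File, the producer would write to storage and the consumer would read it back, costing kernel copies in both directions; here both processes touch the same physical pages.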
MemoryFile source code parsing
The creation of MemoryFile
public MemoryFile(String name, int length) throws IOException {
try {
mSharedMemory = SharedMemory.create(name, length);
mMapping = mSharedMemory.mapReadWrite();
} catch (ErrnoException ex) {
ex.rethrowAsIOException();
}
}
You can see that MemoryFile actually wraps a core class, SharedMemory, which does the real work. Let's see what SharedMemory's creation does.
public static @NonNull SharedMemory create(@Nullable String name, int size) throws ErrnoException {
    ...
    return new SharedMemory(nCreate(name, size));
}

private SharedMemory(FileDescriptor fd) {
    ...
    mFileDescriptor = fd;
    mSize = nGetSize(mFileDescriptor);
    ...
    mMemoryRegistration = new MemoryRegistration(mSize);
    mCleaner = Cleaner.create(mFileDescriptor,
            new Closer(mFileDescriptor, mMemoryRegistration));
}
SharedMemory first uses nCreate to create a file descriptor on the native side and associates it with the SharedMemory object. nGetSize then obtains the size of the region. Finally, that size is registered with the Java virtual machine as a native allocation through MemoryRegistration, and a Cleaner is initialized so that, when appropriate, GC drives the Cleaner to destroy the native-side objects.
private static final class MemoryRegistration {
    private int mSize;
    private int mReferenceCount;

    private MemoryRegistration(int size) {
        mSize = size;
        mReferenceCount = 1;
        VMRuntime.getRuntime().registerNativeAllocation(mSize);
    }

    public synchronized MemoryRegistration acquire() {
        mReferenceCount++;
        return this;
    }

    public synchronized void release() {
        mReferenceCount--;
        if (mReferenceCount == 0) {
            VMRuntime.getRuntime().registerNativeFree(mSize);
        }
    }
}
MemoryRegistration essentially registers the size of the native allocation with the Java virtual machine. Each reference bumps the count, and the registration is released only when the count drops to zero. After all, this is shared memory; its lifetime should not be decided solely by the Java virtual machine's GC.
The core, however, is the nCreate native method, followed by mapReadWrite.
nCreate builds the native underlying shared memory
static jobject SharedMemory_create(JNIEnv* env, jobject, jstring jname, jint size) {
    const char* name = jname ? env->GetStringUTFChars(jname, nullptr) : nullptr;
    int fd = ashmem_create_region(name, size);
    int err = fd < 0 ? errno : 0;
    if (name) {
        env->ReleaseStringUTFChars(jname, name);
    }
    ...
    return jniCreateFileDescriptor(env, fd);
}
ashmem_create_region creates the shared memory region. Remember the Linux adage: everything is a file. Anonymous shared memory is indeed created as a file; it is "anonymous" because the file lives in the tmpfs temporary file system. Finally, a Java FileDescriptor object is created and associated with the fd.
Let's look at what ashmem_create_region does in libcutils.
int ashmem_create_region(const char *name, size_t size)
{
    int ret, save_errno;

    int fd = __ashmem_open();
    if (fd < 0) {
        return fd;
    }

    if (name) {
        char buf[ASHMEM_NAME_LEN] = {0};
        strlcpy(buf, name, sizeof(buf));
        ret = TEMP_FAILURE_RETRY(ioctl(fd, ASHMEM_SET_NAME, buf));
        ...
    }

    ret = TEMP_FAILURE_RETRY(ioctl(fd, ASHMEM_SET_SIZE, size));
    ...
    return fd;
    ...
}
Creating anonymous shared memory consists of three steps:
- 1. __ashmem_open creates the anonymous shared memory.
- 2. ioctl sends the ASHMEM_SET_NAME command to name the region; only once it is named can the corresponding anonymous shared memory be found by name.
- 3. ioctl sends the ASHMEM_SET_SIZE command to set the size of the region.
__ashmem_open
This method eventually calls the following method:
#define ASHMEM_DEVICE "/dev/ashmem"

static int __ashmem_open_locked()
{
    int ret;
    struct stat st;

    int fd = TEMP_FAILURE_RETRY(open(ASHMEM_DEVICE, O_RDWR | O_CLOEXEC));
    if (fd < 0) {
        return fd;
    }

    ret = TEMP_FAILURE_RETRY(fstat(fd, &st));
    if (ret < 0) {
        int save_errno = errno;
        close(fd);
        errno = save_errno;
        return ret;
    }
    if (!S_ISCHR(st.st_mode) || !st.st_rdev) {
        close(fd);
        errno = ENOTTY;
        return -1;
    }

    __ashmem_rdev = st.st_rdev;
    return fd;
}
Here we finally see the open call. Much like opening the Binder driver, /dev/ashmem is opened to reach the open method in the ashmem driver's file_operations, and the corresponding file descriptor fd is obtained.
Stopping here, just remember the three steps of Ashmem creation:
- 1. Open the /dev/ashmem driver to connect to the ashmem driver
- 2. Ioctl sends the ASHMEM_SET_NAME command to create a name for the ashmem
- 3. Ioctl runs the ASHMEM_SET_SIZE command to set the size of anonymous shared memory
SharedMemory.mapReadWrite creates the memory mapping
Once the ashmem anonymous shared memory is created, SharedMemory calls mapReadWrite:
public @NonNull ByteBuffer mapReadWrite() throws ErrnoException {
    return map(OsConstants.PROT_READ | OsConstants.PROT_WRITE, 0, mSize);
}

public @NonNull ByteBuffer map(int prot, int offset, int length) throws ErrnoException {
    checkOpen();
    validateProt(prot);
    ...
    long address = Os.mmap(0, length, prot, OsConstants.MAP_SHARED, mFileDescriptor, offset);
    boolean readOnly = (prot & OsConstants.PROT_WRITE) == 0;
    Runnable unmapper = new Unmapper(address, length, mMemoryRegistration.acquire());
    return new DirectByteBuffer(length, address, mFileDescriptor, unmapper, readOnly);
}
You can see that the map method ends up calling Os.mmap, which in essence issues the mmap system call. This invokes the mmap in ashmem's file_operations and maps a region of virtual memory to the file. When the process later accesses this virtual memory and no physical page is found (and no corresponding data exists on disk to swap in), a page fault is triggered; at that point the virtual memory range logically bound to the file is actually backed by physical memory.
Through this ordinary mmap, user-mode virtual memory maps directly onto physical memory, achieving zero-copy access.
The DirectByteBuffer is also given an unmap callback, Unmapper:
private static final class Unmapper implements Runnable {
    private long mAddress;
    private int mSize;
    private MemoryRegistration mMemoryReference;

    private Unmapper(long address, int size, MemoryRegistration memoryReference) {
        mAddress = address;
        mSize = size;
        mMemoryReference = memoryReference;
    }

    @Override
    public void run() {
        try {
            Os.munmap(mAddress, mSize);
        } catch (ErrnoException e) { /* swallow exception */ }
        mMemoryReference.release();
        mMemoryReference = null;
    }
}
When the reference count tracked by mMemoryRegistration drops to zero, munmap is called to tear the mapping down. MemoryFile therefore reads and writes the mapped virtual memory through the mMapping buffer produced by mapReadWrite. Let's look at MemoryFile's read and write methods.
MemoryFile writes data
Writing goes through the OutputStream obtained from the MemoryFile.
private class MemoryOutputStream extends OutputStream {
    private int mOffset = 0;
    private byte[] mSingleByte;

    @Override
    public void write(byte buffer[], int offset, int count) throws IOException {
        writeBytes(buffer, offset, mOffset, count);
        mOffset += count;
    }

    @Override
    public void write(int oneByte) throws IOException {
        if (mSingleByte == null) {
            mSingleByte = new byte[1];
        }
        mSingleByte[0] = (byte) oneByte;
        write(mSingleByte, 0, 1);
    }
}
You can see that the write method ultimately delegates to writeBytes as the core write routine.
private void beginAccess() throws IOException {
    checkActive();
    if (mAllowPurging) {
        if (native_pin(mSharedMemory.getFileDescriptor(), true)) {
            throw new IOException("MemoryFile has been purged");
        }
    }
}

private void endAccess() throws IOException {
    if (mAllowPurging) {
        native_pin(mSharedMemory.getFileDescriptor(), false);
    }
}

public void writeBytes(byte[] buffer, int srcOffset, int destOffset, int count)
        throws IOException {
    beginAccess();
    try {
        mMapping.position(destOffset);
        mMapping.put(buffer, srcOffset, count);
    } finally {
        endAccess();
    }
}
During this process, native_pin is called to pin the memory so the system cannot reclaim it. Then position is called on mMapping to set the write position, and the buffer data is written into mMapping.
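Note that writeBytes and readBytes are themselves public, so you can skip the stream wrappers and address the region by offset directly. A minimal sketch:

import android.os.MemoryFile;
import java.io.IOException;

final class OffsetAccessDemo {
    static void demo() throws IOException {
        MemoryFile mf = new MemoryFile("demo", 4096);
        try {
            byte[] out = {1, 2, 3, 4};
            // Copy out[0..4) into the region starting at offset 100.
            mf.writeBytes(out, 0, 100, out.length);
            byte[] in = new byte[4];
            // Copy region offsets 100..104 back into in[0..4).
            mf.readBytes(in, 100, 0, in.length);
        } finally {
            mf.close();
        }
    }
}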
Wait a minute: why is there no write system call? If you have read my previous article, you know the core principle of mmap is that physical pages are mapped into virtual memory, so plain memory stores are enough.
public ByteBuffer put(byte[] src, int srcOffset, int length) {
    ...
    checkBounds(srcOffset, length, src.length);
    int pos = position();
    int lim = limit();
    assert (pos <= lim);
    int rem = (pos <= lim ? lim - pos : 0);
    ...
    Memory.pokeByteArray(ix(pos), src, srcOffset, length);
    position = pos + length;
    return this;
}
put first reads the position that was set, which determines where writing starts. It then determines which chunk of memory to write from the length of the incoming data and the source offset.

Finally, put calls Memory.pokeByteArray to copy the data into the mapped virtual address range starting at that offset and running for the data's length, that is, into the corresponding physical pages.
static void Memory_pokeByteArray(JNIEnv* env, jclass, jlong dstAddress, jbyteArray src, jint offset, jint length) {
env->GetByteArrayRegion(src, offset, length, cast<jbyte*>(dstAddress));
}
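The position bookkeeping above is ordinary ByteBuffer behavior, nothing ashmem-specific. A self-contained sketch with a plain direct buffer:

import java.nio.ByteBuffer;

final class PositionDemo {
    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(16);
        buf.position(4);                     // next write starts at offset 4
        buf.put(new byte[] {7, 8}, 0, 2);    // bytes land at offsets 4 and 5
        System.out.println(buf.position());  // prints 6: advanced by the count
    }
}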
MemoryFile reads data
private class MemoryInputStream extends InputStream {
    private int mMark = 0;
    private int mOffset = 0;
    private byte[] mSingleByte;
    ...

    @Override
    public int read() throws IOException {
        if (mSingleByte == null) {
            mSingleByte = new byte[1];
        }
        int result = read(mSingleByte, 0, 1);
        if (result != 1) {
            return -1;
        }
        return mSingleByte[0];
    }

    @Override
    public int read(byte buffer[], int offset, int count) throws IOException {
        if (offset < 0 || count < 0 || offset + count > buffer.length) {
            // readBytes() also does this check, but we need to do it before
            // changing count.
            throw new IndexOutOfBoundsException();
        }
        count = Math.min(count, available());
        if (count < 1) {
            return -1;
        }
        int result = readBytes(buffer, mOffset, offset, count);
        if (result > 0) {
            mOffset += result;
        }
        return result;
    }
    ...
}
You can see that MemoryInputStream's core read path still goes through readBytes.
public int readBytes(byte[] buffer, int srcOffset, int destOffset, int count)
throws IOException {
beginAccess();
try {
mMapping.position(srcOffset);
mMapping.get(buffer, destOffset, count);
} finally {
endAccess();
}
return count;
}
The core is again the mMapping DirectByteBuffer, whose get method does the real work:
public ByteBuffer get(byte[] dst, int dstOffset, int length) {
    ...
    int pos = position();
    int lim = limit();
    assert (pos <= lim);
    int rem = (pos <= lim ? lim - pos : 0);
    ...
    Memory.peekByteArray(ix(pos), dst, dstOffset, length);
    position = pos + length;
    return this;
}
Likewise, it reads the current position, uses it (plus the buffer's base address) as the starting point, and reads data of the given length.
static void Memory_peekByteArray(JNIEnv* env, jclass, jlong srcAddress, jbyteArray dst, jint dstOffset, jint byteCount) {
env->SetByteArrayRegion(dst, dstOffset, byteCount, cast<const jbyte*>(srcAddress));
}
You can see that here the data at srcAddress (the mapped memory) is copied out into the target Java array.
summary
After this MemoryFile analysis, it is clear that using Ashmem anonymous shared memory breaks down into five steps:

- 1. Open the /dev/ashmem driver to connect to the ashmem driver.
- 2. Ioctl sends the ASHMEM_SET_NAME command to create a name for the ashmem.
- 3. Ioctl sends the ASHMEM_SET_SIZE command to set the ashmem size.
- 4. Mmap performs the memory mapping.
- 5. Read and write the mapped memory directly.
After the first three steps, the process has created, in the ashmem driver, a file descriptor identifying an anonymous region with that name and with its size set; at this point, though, nothing is actually mapped.

In the fourth step, calling mmap logically associates the file with virtual memory.

In the fifth step, reads and writes trigger page faults, which allocate physical pages and bind them to the mapping.
This reminds me of some ridiculous advice seen online: during memory optimization, pass some data through shared memory in order to shrink the Java heap and thereby evade Java memory accounting. Is that optimization? It merely exploits loopholes in how older Android versions measured memory. Anyone familiar with the Linux kernel can see it is just a smokescreen: both user space and the kernel go through virtual memory (apart from pages allocated directly by the memory-management subsystem), and virtual memory is itself limited in size.

The Java virtual machine limits the Java heap and tracks native allocations precisely because each process can only reserve a limited amount of virtual memory. Moving data into shared memory is no real memory optimization at all.
Ashmem driver
Now that we understand how Ashmem is used, let's read what it does in the kernel, following the same order: from initialization to use.
Ashmem initialization
static int __init ashmem_init(void)
{
    int ret;

    ashmem_area_cachep = kmem_cache_create("ashmem_area_cache",
                                           sizeof(struct ashmem_area), 0, 0, NULL);
    ...
    ashmem_range_cachep = kmem_cache_create("ashmem_range_cache",
                                            sizeof(struct ashmem_range), 0, 0, NULL);
    ...
    ret = misc_register(&ashmem_misc);
    ...
    register_shrinker(&ashmem_shrinker);
    pr_info("initialized\n");
    return 0;
}
Two slab caches are created here, for ashmem_area and ashmem_range. The ashmem_area structure describes a region carved out for user space, while ashmem_range describes, in a linked list, the unpinned memory blocks that the kernel may reclaim when it needs memory.
Finally, register_shrinker registers the Ashmem reclaim function with the memory management system.
The Ashmem file_operations
The quickest way to understand a driver is to look at its file_operations structure, to see which operations it supports and which function each one points to:
static const struct file_operations ashmem_fops = {
.owner = THIS_MODULE,
.open = ashmem_open,
.release = ashmem_release,
.read = ashmem_read,
.llseek = ashmem_llseek,
.mmap = ashmem_mmap,
.unlocked_ioctl = ashmem_ioctl,
#ifdef CONFIG_COMPAT
.compat_ioctl = compat_ashmem_ioctl,
#endif
};
There are four core methods: open, read, mmap, and unlocked_ioctl. By analyzing these four methods in Ashmem, we can clear up what Ashmem is doing.
Ashmem open
#define ASHMEM_NAME_PREFIX "dev/ashmem/"
#define ASHMEM_NAME_PREFIX_LEN (sizeof(ASHMEM_NAME_PREFIX) - 1)
#define ASHMEM_FULL_NAME_LEN (ASHMEM_NAME_LEN + ASHMEM_NAME_PREFIX_LEN)

static int ashmem_open(struct inode *inode, struct file *file)
{
    struct ashmem_area *asma;
    int ret;

    ret = generic_file_open(inode, file);
    ...
    asma = kmem_cache_zalloc(ashmem_area_cachep, GFP_KERNEL);
    ...
    INIT_LIST_HEAD(&asma->unpinned_list);
    memcpy(asma->name, ASHMEM_NAME_PREFIX, ASHMEM_NAME_PREFIX_LEN);
    asma->prot_mask = PROT_MASK;
    file->private_data = asma;
    return 0;
}
An ashmem_area is allocated from ashmem_area_cachep, unpinned_list is initialized as the head of this area's list of unpinned memory blocks, and the asma's name is prefixed with dev/ashmem/.

Finally, the current ashmem_area is stored as the file's private data.
Ashmem ioctl sets the name and size
The name is then set via ioctl, calling ASHMEM_SET_NAME.
static long ashmem_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
    struct ashmem_area *asma = file->private_data;
    long ret = -ENOTTY;

    switch (cmd) {
    case ASHMEM_SET_NAME:
        ret = set_name(asma, (void __user *) arg);
        break;
    case ASHMEM_GET_NAME:
        ret = get_name(asma, (void __user *) arg);
        break;
    case ASHMEM_SET_SIZE:
        ret = -EINVAL;
        if (!asma->file) {
            ret = 0;
            asma->size = (size_t) arg;
        }
        break;
    case ASHMEM_GET_SIZE:
        ret = asma->size;
        break;
    ...
    return ret;
}
Let's start with a few simple commands. ASHMEM_SET_NAME sets the name via set_name; ASHMEM_GET_NAME obtains it via get_name; ASHMEM_SET_SIZE sets the region size, which is essentially just assigning the size field in the asma. Let's look at the logic of set_name.
static int set_name(struct ashmem_area *asma, void __user *name)
{
    int len;
    int ret = 0;
    char local_name[ASHMEM_NAME_LEN];

    len = strncpy_from_user(local_name, name, ASHMEM_NAME_LEN);
    ...
    if (len == ASHMEM_NAME_LEN)
        local_name[ASHMEM_NAME_LEN - 1] = '\0';
    mutex_lock(&ashmem_mutex);
    /* cannot change an existing mapping's name */
    if (unlikely(asma->file))
        ret = -EINVAL;
    else
        strcpy(asma->name + ASHMEM_NAME_PREFIX_LEN, local_name);
    mutex_unlock(&ashmem_mutex);
    return ret;
}
The user-supplied name is copied in after the dev/ashmem/ prefix, so the full name becomes dev/ashmem/<name>. Of course, once the asma already has a mapped file, renaming is refused.
Ashmem mmap maps the memory
static int ashmem_mmap(struct file *file, struct vm_area_struct *vma)
{
    struct ashmem_area *asma = file->private_data;
    int ret = 0;

    mutex_lock(&ashmem_mutex);

    /* user needs to SET_SIZE before mapping */
    if (unlikely(!asma->size)) {
        ret = -EINVAL;
        goto out;
    }

    /* requested protection bits must match our allowed protection mask */
    if (unlikely((vma->vm_flags & ~calc_vm_prot_bits(asma->prot_mask)) &
                 calc_vm_prot_bits(PROT_MASK))) {
        ret = -EPERM;
        goto out;
    }
    vma->vm_flags &= ~calc_vm_may_flags(~asma->prot_mask);

    if (!asma->file) {
        char *name = ASHMEM_NAME_DEF;
        struct file *vmfile;

        if (asma->name[ASHMEM_NAME_PREFIX_LEN] != '\0')
            name = asma->name;

        /* ... and allocate the backing shmem file */
        vmfile = shmem_file_setup(name, asma->size, vma->vm_flags);
        if (unlikely(IS_ERR(vmfile))) {
            ret = PTR_ERR(vmfile);
            goto out;
        }
        asma->file = vmfile;
    }
    get_file(asma->file);

    if (vma->vm_flags & VM_SHARED)
        shmem_set_file(vma, asma->file);
    else {
        if (vma->vm_file)
            fput(vma->vm_file);
        vma->vm_file = asma->file;
    }

out:
    mutex_unlock(&ashmem_mutex);
    return ret;
}
The asma must already have a size, otherwise an error is returned. Then the protection bits requested for the vma are checked against the allowed mask; if they don't comply, an error is returned.

Next it checks whether the asma's file structure has been created; if not, it takes the asma's name and size and creates a backing file through shmem_file_setup.

If the vma allows sharing (VM_SHARED), shmem_set_file is called to attach the file to the mapping.
Let’s look at shmem_file_setup and shmem_set_file.
shmem_file_setup
static struct file *__shmem_file_setup(const char *name, loff_t size,
                                       unsigned long flags, unsigned int i_flags)
{
    struct file *res;
    struct inode *inode;
    struct path path;
    struct super_block *sb;
    struct qstr this;
    ...
    res = ERR_PTR(-ENOMEM);
    this.name = name;
    this.len = strlen(name);
    this.hash = 0; /* will go */
    sb = shm_mnt->mnt_sb;
    path.mnt = mntget(shm_mnt);
    path.dentry = d_alloc_pseudo(sb, &this);
    ...
    d_set_d_op(path.dentry, &anon_ops);

    res = ERR_PTR(-ENOSPC);
    inode = shmem_get_inode(sb, NULL, S_IFREG | S_IRWXUGO, 0, flags);
    ...
    inode->i_flags |= i_flags;
    d_instantiate(path.dentry, inode);
    inode->i_size = size;
    clear_nlink(inode); /* It is unlinked */

    res = ERR_PTR(ramfs_nommu_expand_for_mapping(inode, size));
    if (IS_ERR(res))
        goto put_path;

    res = alloc_file(&path, FMODE_WRITE | FMODE_READ, &shmem_file_operations);
    ...
    return res;

put_memory:
    shmem_unacct_size(flags, size);
put_path:
    path_put(&path);
    return res;
}
The following very important things are done in __shmem_file_setup:
- 1. d_instantiate sets up the directory entry (dentry) for the pseudo file.
- 2. shmem_get_inode creates the shared inode; the inode is the basic unit Linux uses to access a file system and holds the file's metadata.
- 3. alloc_file allocates a file structure and installs shmem_file_operations as its file_operations.
Let’s look at shmem_file_operations:
static const struct file_operations shmem_file_operations = {
    .mmap = shmem_mmap,
#ifdef CONFIG_TMPFS
    .llseek = shmem_file_llseek,
    .read = new_sync_read,
    .write = new_sync_write,
    .read_iter = shmem_file_read_iter,
    .write_iter = generic_file_write_iter,
    .fsync = noop_fsync,
    .splice_read = shmem_file_splice_read,
    .splice_write = iter_file_splice_write,
    .fallocate = shmem_fallocate, /* preallocate / punch holes in physical memory */
#endif
};
With shmem_file_setup, the ashmem driver has turned the file backing the vma into shared memory.

And the name shmem tells you that this is in fact Linux's own shared memory.
shmem_set_file
void shmem_set_file(struct vm_area_struct *vma, struct file *file)
{
if (vma->vm_file)
fput(vma->vm_file);
vma->vm_file = file;
vma->vm_ops = &shmem_vm_ops;
}
You can see that the method essentially associates the vm_file with the file structure and sets the virtual memory manipulation function:
static const struct vm_operations_struct shmem_vm_ops = {
.fault = shmem_fault,
.map_pages = filemap_map_pages,
#ifdef CONFIG_NUMA
.set_policy = shmem_set_policy,
.get_policy = shmem_get_policy,
#endif
.remap_pages = generic_file_remap_pages,
};
This structure is particularly important: its fault handler, shmem_fault, determines how shared memory binds physical pages when a page fault occurs.

At this point, the vma and the anonymous shared memory file are only logically associated. When we try to read or write this virtual memory, the CPU finds no mapping, and since there is no corresponding data on disk to swap in, a fresh piece of physical memory gets bound.
Set aside for now who calls shmem_fault (we'll come back to that); let's see what the method does.
static int shmem_fault(struct vm_area_struct *vma, struct vm_fault *vmf)
{
    struct inode *inode = file_inode(vma->vm_file);
    int error;
    int ret = VM_FAULT_LOCKED;

    if (unlikely(inode->i_private)) {
        struct shmem_falloc *shmem_falloc;

        spin_lock(&inode->i_lock);
        shmem_falloc = inode->i_private;
        if (shmem_falloc &&
            shmem_falloc->waitq &&
            vmf->pgoff >= shmem_falloc->start &&
            vmf->pgoff < shmem_falloc->next) {
            ...
        }
        spin_unlock(&inode->i_lock);
    }

    error = shmem_getpage(inode, vmf->pgoff, &vmf->page, SGP_CACHE, &ret);
    ...
    return ret;
}
When the page fault fires, any pre-allocated physical memory is looked up first; if there is none, shmem_getpage is called to find or allocate the physical page for the faulting address recorded in vmf.
static int shmem_getpage_gfp(struct inode *inode, pgoff_t index,
                             struct page **pagep, enum sgp_type sgp,
                             gfp_t gfp, int *fault_type)
{
    struct address_space *mapping = inode->i_mapping;
    struct shmem_inode_info *info;
    struct shmem_sb_info *sbinfo;
    struct mem_cgroup *memcg;
    struct page *page;
    swp_entry_t swap;
    int error;
    int once = 0;
    int alloced = 0;

    if (index > (MAX_LFS_FILESIZE >> PAGE_CACHE_SHIFT))
        return -EFBIG;
repeat:
    swap.val = 0;
    page = find_lock_entry(mapping, index);
    ...
    /*
     * Fast cache lookup did not find it:
     * bring it back from swap or allocate.
     */
    info = SHMEM_I(inode);
    sbinfo = SHMEM_SB(inode->i_sb);

    if (swap.val) {
        ....
    } else {
        if (shmem_acct_block(info->flags)) {
            error = -ENOSPC;
            goto failed;
        }
        if (sbinfo->max_blocks) {
            if (percpu_counter_compare(&sbinfo->used_blocks,
                                       sbinfo->max_blocks) >= 0) {
                error = -ENOSPC;
                goto unacct;
            }
            percpu_counter_inc(&sbinfo->used_blocks);
        }

        page = shmem_alloc_page(gfp, info, index);
        if (!page) {
            error = -ENOMEM;
            goto decused;
        }
        __SetPageSwapBacked(page);
        __set_page_locked(page);
        if (sgp == SGP_WRITE)
            __SetPageReferenced(page);

        error = mem_cgroup_try_charge(page, current->mm, gfp, &memcg);
        if (error)
            goto decused;
        error = radix_tree_maybe_preload(gfp & GFP_RECLAIM_MASK);
        if (!error) {
            error = shmem_add_to_page_cache(page, mapping, index, NULL);
            radix_tree_preload_end();
        }
        if (error) {
            mem_cgroup_cancel_charge(page, memcg);
            goto decused;
        }
        mem_cgroup_commit_charge(page, memcg, false);
        lru_cache_add_anon(page);

        spin_lock(&info->lock);
        info->alloced++;
        inode->i_blocks += BLOCKS_PER_PAGE;
        shmem_recalc_inode(inode);
        spin_unlock(&info->lock);
        alloced = true;

        /*
         * Let SGP_FALLOC use the SGP_WRITE optimization on a new page.
         */
        if (sgp == SGP_FALLOC)
            sgp = SGP_WRITE;
clear:
        /*
         * Let SGP_WRITE caller clear ends if write does not fill page;
         * but SGP_FALLOC on a page fallocated earlier must initialize
         * it now, lest undo on failure cancel our earlier guarantee.
         */
        if (sgp != SGP_WRITE) {
            clear_highpage(page);
            flush_dcache_page(page);
            SetPageUptodate(page);
        }
        if (sgp == SGP_DIRTY)
            set_page_dirty(page);
    }

    /* Perhaps the file has been truncated since we checked */
    ...
    *pagep = page;
    return 0;
    ...
}
It does the following things:

- 1. Compute the page index and first try to find whether a physical page is already cached for this address (find_lock_entry).
- 2. Check the flags and the block accounting of the inode's superblock against the preset limits.
- 3. shmem_alloc_page requests a physical page from the buddy system via alloc_page.
- 4. mem_cgroup_try_charge records the page against the cgroup and mem_cgroup_commit_charge commits the charge; the cgroup mechanism lets Linux track requested memory and find opportunities to reclaim it.
- 5. Finally, insert the page into the radix tree of the mapping (the address_space structure).
By the way, the address_space structure records the association between a file and memory. A radix tree is a tree keyed on bits: somewhat like a Huffman tree, it consumes the key a few bits at a time, each chunk selecting a branch at the next level, and once all bits of the key have been consumed you have found the content. It is a very fast mapping data structure.
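To make the idea concrete, here is a toy Java sketch of radix-tree lookup, assuming the kernel's 6-bit chunks and 64-slot nodes; this is illustrative only, not the kernel implementation:

final class RadixNode {
    // Child nodes at interior levels; cached pages at the leaf level.
    final Object[] slots = new Object[64];
}

final class RadixLookup {
    // Walk "height" levels, consuming the page index 6 bits at a time.
    static Object lookup(RadixNode root, long index, int height) {
        RadixNode node = root;
        for (int shift = (height - 1) * 6; shift > 0; shift -= 6) {
            node = (RadixNode) node.slots[(int) ((index >> shift) & 63)];
            if (node == null) return null;   // nothing cached at this index
        }
        return node.slots[(int) (index & 63)];
    }
}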
This completes the mapping. Notice that the whole path looks much more like mmap on a regular file system such as ext4 than like Binder's mmap.
Ashmem driver reads and writes
Remember, reads and writes do not go through the file that Ashmem generates; they operate on the mapped area directly. In other words, the shared memory address range is read and written directly, without read or write system calls, so the corresponding file_operations are never invoked.
Ashmem lock and unlock
Remember ashmem_range, the other data structure we met in the Ashmem initialization section? This is where it comes in. Ashmem obtains its anonymous shared memory by leaning on shmem, so what is the difference between Ashmem and shmem? The difference lies in the pinning (locking) and unpinning (unlocking) of the mapped region.
Looking back at MemoryFile, native_pin is called around every read or write:
native_pin(mSharedMemory.getFileDescriptor(), true);   // lock (pin)
native_pin(mSharedMemory.getFileDescriptor(), false);  // unlock (unpin)
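From application code, the only public switch over this pinning is MemoryFile's deprecated allowPurging method. A sketch of what it implies:

import android.os.MemoryFile;
import java.io.IOException;

final class PurgingDemo {
    static void demo() throws IOException {
        MemoryFile mf = new MemoryFile("purgable", 4096);
        try {
            // Deprecated API: unpins the region. Pages are then pinned only for
            // the duration of each read/write; in between, the kernel may purge
            // them, and a later access throws IOException.
            mf.allowPurging(true);
            mf.writeBytes(new byte[] {1}, 0, 0, 1);
            mf.allowPurging(false); // pin again: contents are safe from purging
        } finally {
            mf.close();
        }
    }
}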
The call method in native layer is as follows:
static jboolean android_os_MemoryFile_pin(JNIEnv* env, jobject clazz, jobject fileDescriptor,
jboolean pin) {
int fd = jniGetFDFromFileDescriptor(env, fileDescriptor);
int result = (pin ? ashmem_pin_region(fd, 0, 0) : ashmem_unpin_region(fd, 0, 0));
if (result < 0) {
jniThrowException(env, "java/io/IOException", NULL);
}
return result == ASHMEM_WAS_PURGED;
}
int ashmem_pin_region(int fd, size_t offset, size_t len)
{
// TODO: should LP64 reject too-large offset/len?
ashmem_pin pin = { static_cast<uint32_t>(offset), static_cast<uint32_t>(len) };
int ret = __ashmem_is_ashmem(fd, 1);
if (ret < 0) {
return ret;
}
return TEMP_FAILURE_RETRY(ioctl(fd, ASHMEM_PIN, &pin));
}
int ashmem_unpin_region(int fd, size_t offset, size_t len)
{
// TODO: should LP64 reject too-large offset/len?
ashmem_pin pin = { static_cast<uint32_t>(offset), static_cast<uint32_t>(len) };
int ret = __ashmem_is_ashmem(fd, 1);
if (ret < 0) {
return ret;
}
return TEMP_FAILURE_RETRY(ioctl(fd, ASHMEM_UNPIN, &pin));
}
You can see that the offset and the length len are packed into an ashmem_pin structure. MemoryFile always passes 0 for both, which by convention means the whole region from the start.
This method essentially calls the ioctl commands ASHMEM_PIN and ASHMEM_UNPIN:
static long ashmem_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
{
    struct ashmem_area *asma = file->private_data;
    long ret = -ENOTTY;

    switch (cmd) {
    ...
    case ASHMEM_PIN:
    case ASHMEM_UNPIN:
    case ASHMEM_GET_PIN_STATUS:
        ret = ashmem_pin_unpin(asma, cmd, (void __user *) arg);
        break;
    ...
    return ret;
}
static int ashmem_pin_unpin(struct ashmem_area *asma, unsigned long cmd,
                            void __user *p)
{
    struct ashmem_pin pin;
    size_t pgstart, pgend;
    int ret = -EINVAL;
    ...
    if (unlikely(copy_from_user(&pin, p, sizeof(pin))))
        return -EFAULT;
    ...
    pgstart = pin.offset / PAGE_SIZE;
    pgend = pgstart + (pin.len / PAGE_SIZE) - 1;

    mutex_lock(&ashmem_mutex);

    switch (cmd) {
    case ASHMEM_PIN:
        ret = ashmem_pin(asma, pgstart, pgend);
        break;
    case ASHMEM_UNPIN:
        ret = ashmem_unpin(asma, pgstart, pgend);
        break;
    case ASHMEM_GET_PIN_STATUS:
        ret = ashmem_get_pin_status(asma, pgstart, pgend);
        break;
    }

    mutex_unlock(&ashmem_mutex);
    return ret;
}
The start and end pages of the region to pin or unpin are computed here. pgstart is the offset integer-divided by the 4 KB page size, which yields the page the offset falls in; since integer division discards the remainder, this lands exactly on a page boundary. pgend is pgstart plus the number of pages the length occupies, minus 1.

From this calculation you can see that ashmem pins and unpins at page granularity, the most basic unit.
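A user-space restatement of that arithmetic (assuming 4 KB pages; illustrative only):

final class PageRangeDemo {
    static final int PAGE_SIZE = 4096;

    // Mirrors the kernel's computation; returns inclusive page indices.
    static int[] pageRange(int offset, int len) {
        int pgstart = offset / PAGE_SIZE;             // page the offset falls in
        int pgend = pgstart + (len / PAGE_SIZE) - 1;  // last page covered
        return new int[] {pgstart, pgend};
    }

    public static void main(String[] args) {
        int[] r = pageRange(8192, 12288);             // covers pages 2..4
        System.out.println(r[0] + ".." + r[1]);
    }
}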
Before reading the source, keep in mind that a freshly mmap'ed region is pinned by default.
Ashmem unlock ashmem_unpin
static int ashmem_unpin(struct ashmem_area *asma, size_t pgstart, size_t pgend)
{
    struct ashmem_range *range, *next;
    unsigned int purged = ASHMEM_NOT_PURGED;

restart:
    list_for_each_entry_safe(range, next, &asma->unpinned_list, unpinned) {
        /* short circuit: this is our insertion point */
        if (range_before_page(range, pgstart))
            break;

        /*
         * The user can ask us to unpin pages that are already entirely
         * or partially pinned. We handle those two cases here.
         */
        if (page_range_subsumed_by_range(range, pgstart, pgend))
            return 0;
        if (page_range_in_range(range, pgstart, pgend)) {
            pgstart = min_t(size_t, range->pgstart, pgstart),
            pgend = max_t(size_t, range->pgend, pgend);
            purged |= range->purged;
            range_del(range);
            goto restart;
        }
    }

    return range_alloc(asma, range, purged, pgstart, pgend);
}
Here each already-unpinned range in unpinned_list is compared against the request. There are roughly five cases. In cases 1, 2 and 3, where an existing range intersects [pgstart, pgend], the two are merged into one larger range.

In the fourth case, the existing range already contains [pgstart, pgend], so nothing needs to be done.

In the fifth case, the range does not intersect the area being unpinned and lies entirely before it, so the traversal can stop; no merge is needed. A user-space model of this merge logic is sketched below.
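This is a minimal Java model of that merge, under simplifying assumptions (inclusive page indices, no purged-flag bookkeeping, no early-exit optimization); it is not kernel code:

import java.util.Iterator;
import java.util.List;

final class UnpinModel {
    // "unpinned" holds {pgstart, pgend} pairs. The kernel short-circuits the
    // fully-contained case; absorbing it into the merge yields the same set.
    static void unpin(List<int[]> unpinned, int pgstart, int pgend) {
        for (Iterator<int[]> it = unpinned.iterator(); it.hasNext(); ) {
            int[] r = it.next();
            if (r[0] <= pgend && r[1] >= pgstart) {  // intersects the request
                pgstart = Math.min(pgstart, r[0]);   // cases 1-4: absorb it
                pgend = Math.max(pgend, r[1]);
                it.remove();
            }                                        // case 5: leave untouched
        }
        unpinned.add(new int[] {pgstart, pgend});    // kernel: range_alloc
    }
}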
range_alloc
static int range_alloc(struct ashmem_area *asma,
                       struct ashmem_range *prev_range, unsigned int purged,
                       size_t start, size_t end)
{
    struct ashmem_range *range;

    range = kmem_cache_zalloc(ashmem_range_cachep, GFP_KERNEL);
    ...
    range->asma = asma;
    range->pgstart = start;
    range->pgend = end;
    range->purged = purged;

    list_add_tail(&range->unpinned, &prev_range->unpinned);

    if (range_on_lru(range))
        lru_add(range);

    return 0;
}
An ashmem_range is allocated from ashmem_range_cachep, its start and end pages are set, and it is appended to the asma's unpinned_list.
static inline void lru_add(struct ashmem_range *range)
{
list_add_tail(&range->lru, &ashmem_lru_list);
lru_count += range_size(range);
}
Finally, the range is appended to the global ashmem_lru_list, and the total unpinned size is accounted in lru_count.
Ashmem lock ashmem_pin
static int ashmem_pin(struct ashmem_area *asma, size_t pgstart, size_t pgend)
{
struct ashmem_range *range, *next;
int ret = ASHMEM_NOT_PURGED;
list_for_each_entry_safe(range, next, &asma->unpinned_list, unpinned) {
/* moved past last applicable page; we can short circuit */
if (range_before_page(range, pgstart))
break;
/*
* The user can ask us to pin pages that span multiple ranges,
* or to pin pages that aren't even unpinned, so this is messy.
*
* Four cases:
* 1. The requested range subsumes an existing range, so we
* just remove the entire matching range.
* 2. The requested range overlaps the start of an existing
* range, so we just update that range.
* 3. The requested range overlaps the end of an existing
* range, so we just update that range.
* 4. The requested range punches a hole in an existing range,
* so we have to update one side of the range and then
* create a new range for the other side.
*/
if (page_range_in_range(range, pgstart, pgend)) {
ret |= range->purged;
/* Case #1: Easy. Just nuke the whole thing. */
if (page_range_subsumes_range(range, pgstart, pgend)) {
range_del(range);
continue;
}
/* Case #2: We overlap from the start, so adjust it */
if (range->pgstart >= pgstart) {
range_shrink(range, pgend + 1, range->pgend);
continue;
}
/* Case #3: We overlap from the rear, so adjust it */
if (range->pgend <= pgend) {
range_shrink(range, range->pgstart, pgstart-1);
continue;
}
/*
* Case #4: We eat a chunk out of the middle. A bit
* more complicated, we allocate a new range for the
* second half and adjust the first chunk's endpoint.
*/
range_alloc(asma, range, range->purged,
pgend + 1, range->pgend);
range_shrink(range, range->pgstart, pgstart - 1);
break;
}
}
return ret;
}
Similarly, pinning breaks down into four cases, matching the comments in the code above:

- 1. The requested [pgstart, pgend] subsumes an existing range, so that range is removed from the unpinned_list entirely.
- 2. The request overlaps the front of an existing range, so the range is shrunk to start right after pgend.
- 3. The request overlaps the rear of an existing range, so the range is shrunk to end right before pgstart.
- 4. The request punches a hole in the middle of an existing range, so the first half is shrunk and a new range is allocated for the second half.

The fifth case, no intersection at all, needs no handling. The sketch below models these cases in user space.
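The matching model for pinning, again under the same simplifying assumptions (inclusive page indices; not kernel code):

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class PinModel {
    // "unpinned" holds {pgstart, pgend} pairs; pin(p, q) carves [p, q] out.
    static void pin(List<int[]> unpinned, int p, int q) {
        List<int[]> secondHalves = new ArrayList<>();
        for (Iterator<int[]> it = unpinned.iterator(); it.hasNext(); ) {
            int[] r = it.next();
            if (r[1] < p || r[0] > q) {        // case 5: no overlap, skip
                continue;
            }
            if (p <= r[0] && q >= r[1]) {      // case 1: fully covered, drop it
                it.remove();
            } else if (p <= r[0]) {            // case 2: overlaps the front
                r[0] = q + 1;
            } else if (q >= r[1]) {            // case 3: overlaps the rear
                r[1] = p - 1;
            } else {                           // case 4: punch a hole
                secondHalves.add(new int[] {q + 1, r[1]});
                r[1] = p - 1;
            }
        }
        unpinned.addAll(secondHalves);         // re-add halves from case 4
    }
}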
Ashmem memory reclamation
Given that unpinned_list tracks memory that has been unpinned, when is it actually reclaimed? The shrinker structure registered during initialization is responsible, so let's look at what it does.
static struct shrinker ashmem_shrinker = {
.count_objects = ashmem_shrink_count,
.scan_objects = ashmem_shrink_scan,
.seeks = DEFAULT_SEEKS * 4,
};
At the heart of this is the scan function, ashmem_shrink_scan:
static unsigned long
ashmem_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
{
struct ashmem_range *range, *next;
unsigned long freed = 0;
/* We might recurse into filesystem code, so bail out if necessary */
...
list_for_each_entry_safe(range, next, &ashmem_lru_list, lru) {
loff_t start = range->pgstart * PAGE_SIZE;
loff_t end = (range->pgend + 1) * PAGE_SIZE;
range->asma->file->f_op->fallocate(range->asma->file,
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
start, end - start);
range->purged = ASHMEM_WAS_PURGED;
lru_del(range);
freed += range_size(range);
if (--sc->nr_to_scan <= 0)
break;
}
mutex_unlock(&ashmem_mutex);
return freed;
}
As you can see, the procedure walks the global ashmem_lru_list of previously unpinned ranges, repeatedly calling lru_del to remove each range entry and recomputing the amount freed. Most importantly, it invokes the file's fallocate operation.

Remember that this file operation was set in __shmem_file_setup? Let's go back into shmem and look at how shmem_fallocate dissolves the mapping.
static long shmem_fallocate(struct file *file, int mode, loff_t offset,
                            loff_t len)
{
    struct inode *inode = file_inode(file);
    struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
    struct shmem_inode_info *info = SHMEM_I(inode);
    struct shmem_falloc shmem_falloc;
    pgoff_t start, index, end;
    int error;

    if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
        return -EOPNOTSUPP;

    mutex_lock(&inode->i_mutex);

    if (mode & FALLOC_FL_PUNCH_HOLE) {
        struct address_space *mapping = file->f_mapping;
        loff_t unmap_start = round_up(offset, PAGE_SIZE);
        loff_t unmap_end = round_down(offset + len, PAGE_SIZE) - 1;
        DECLARE_WAIT_QUEUE_HEAD_ONSTACK(shmem_falloc_waitq);

        /* protected by i_mutex */
        if (info->seals & F_SEAL_WRITE) {
            error = -EPERM;
            goto out;
        }

        shmem_falloc.waitq = &shmem_falloc_waitq;
        shmem_falloc.start = unmap_start >> PAGE_SHIFT;
        shmem_falloc.next = (unmap_end + 1) >> PAGE_SHIFT;
        spin_lock(&inode->i_lock);
        inode->i_private = &shmem_falloc;
        spin_unlock(&inode->i_lock);

        if ((u64)unmap_end > (u64)unmap_start)
            unmap_mapping_range(mapping, unmap_start,
                                1 + unmap_end - unmap_start, 0);
        shmem_truncate_range(inode, offset, offset + len - 1);
        /* No need to unmap again: hole-punching leaves COWed pages */

        spin_lock(&inode->i_lock);
        inode->i_private = NULL;
        wake_up_all(&shmem_falloc_waitq);
        spin_unlock(&inode->i_lock);
        error = 0;
        goto out;
    }
    ...
out:
    mutex_unlock(&inode->i_mutex);
    return error;
}
Because the caller passes the FALLOC_FL_PUNCH_HOLE flag, execution enters this branch. The core call is unmap_mapping_range, which severs the binding between physical memory and virtual memory; shmem_truncate_range then releases the pages held by the address_space mapping. This leaves a hole of the same size in the file: the range can still be accessed later (within the file's size), and writing into it will fault fresh physical pages back in.
conclusion
The process of using Ashmem is as follows:
- 1. ashmem_create_region creates an anonymous shared memory region, essentially via the open system call.
- 2. ioctl sets the name and size of the region. The name is prefixed with dev/ashmem/ and exists so that others can find the target region.
- 3. mmap maps the file into virtual memory (physical pages come later).
- 4. Read and write directly to the address area.
The name and size must be set via ioctl first, otherwise the mapping fails: both are validated before mapping.

The mmap step associates the anonymous shared memory region with a file structure and installs shmem's file operations and shared-memory vma operations. At this point Ashmem hands the work off to shmem, which ultimately maps virtual memory to physical pages via alloc_page when the first fault arrives.
So Ashmem essentially builds on shmem shared memory. What, then, distinguishes Ashmem from shmem? The key addition is that the memory system can automatically reclaim unpinned Ashmem memory when it needs to.

Hence Ashmem's important pin/unpin mechanism. All unpinned blocks are managed in the owning asma's unpinned_list and recorded in the global ashmem_lru_list; pinning removes the corresponding records from unpinned_list and ashmem_lru_list.

The whole block produced by mmap is pinned by default. Once a range has been unpinned and added to the global ashmem_lru_list, the memory-management system can walk ashmem_lru_list and unmap that memory through the shmem_fallocate file operation, leaving a hole in the file (the contents can be restored later through the page cache and fresh allocation when the range is written again).
thinking
So what’s the difference between Ashmem and Binder?
The main difference is that Binder's mmap binds physical pages to virtual memory up front through the buddy system, whereas Ashmem binds them in its fault handler. In other words, Ashmem loads pages on demand, while Binder allocates them at mmap time. This fits Binder's design: Binder communication starts as soon as Android boots, and every process must set up a kernel buffer in the Binder driver via mmap so that communication works immediately, without waiting for page faults to allocate physical pages. From that standpoint, Binder's choice makes sense.

What are the design trade-offs between the two? Ashmem is clearly about opening a large memory channel so processes can exchange bulk data. Binder, by contrast, favors small commands with a clear direction and ordering, and guarantees the reliability of each command. The two look functionally similar, but Binder maintains far more complex data structures to provide those guarantees.
Author: yjy239. Link: www.jianshu.com/p/6a8513fdb… Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source.