iOS storage solutions: from beginner to master
Introduction
In business development, data handling always takes up a large share of the time, and different storage solutions suit different requirements. By where the data lives, storage can be divided into memory storage and disk storage. Memory storage exchanges data quickly, so a memory cache is worth considering wherever work is complex or time-consuming. Disk storage is much slower: because of the physical characteristics of a disk, an operation may require mechanical steps such as switching tracks. In return, a disk can hold a large amount of data and keep it permanently, so the data can be loaded back into the program after the app has been closed. For that reason iOS offers many ways to save data to disk.
In the following sections I introduce the application scenarios of memory storage and disk storage in turn. Memory storage is relatively simple, so most of the article is about disk storage. Disk storage can be divided into file system storage and database storage. The database is where we run into the most problems in practice, so I only briefly cover the applicable scenarios, principles, and trade-offs of each file storage solution and spend most of the space on databases. SQLite is the most widely used database solution on mobile, so I introduce SQLite and explore some of its performance issues. Multithreading has always been mysterious and error-prone, so we also look at SQLite's multithreading from the ground up.
Memory storage
Memory storage can also be called a memory cache, because data stored in memory is retained only while the app is running. For example, data obtained from the server can be kept in memory to relieve pressure on the server, save the user's traffic and time, and improve the user experience. NSURLConnection caches resources in memory by default. A memory cache can also hold processed data: a feed, for example, can store its processed data in the memory cache and fetch it directly from there the next time it is needed. Common memory cache frameworks include NSCache, TMMemoryCache, PINMemoryCache, and YYMemoryCache. NSCache is a simple memory cache provided by Apple; its API is similar to NSDictionary's, except that it is thread-safe. YYMemoryCache internally combines a doubly linked list with an NSDictionary to implement the LRU (least recently used) eviction algorithm. The general flow of a memory cache is as follows:
- The app first requests the resource from the memory cache.
- If the memory cache has the resource, it is returned directly. If not, the resource is requested from its source: it may live on the server and require a network request, or it may be local and require a file system or database read.
- The fetched resource is first stored in the memory cache, so that later requests are faster.
- The data is then read from the cache and handed to the app.
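The flow above can be sketched in a few lines. This is a minimal illustration in Python, not how NSCache or YYMemoryCache is actually implemented; `LRUCache`, `fetch`, and `slow_source` are names made up for the example, and `slow_source` stands in for a network, file, or database read.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny in-memory cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


def fetch(key, cache, load_from_source):
    """Cache-first lookup: try memory, fall back to the slow source."""
    value = cache.get(key)
    if value is None:
        value = load_from_source(key)  # network, file system, or database
        cache.put(key, value)
    return value


loads = []  # records every trip to the slow source


def slow_source(key):
    # Hypothetical stand-in for a network or disk read.
    loads.append(key)
    return key.upper()


cache = LRUCache(capacity=2)
first = fetch("a", cache, slow_source)   # miss: loaded from the source
second = fetch("a", cache, slow_source)  # hit: served from memory
fetch("b", cache, slow_source)
fetch("c", cache, slow_source)           # cache is full, so "a" is evicted
```

The second request for "a" never reaches `slow_source`, which is the whole point of the memory cache. The capacity limit plays the role that memory pressure plays on a real device.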
Disk storage
Memory storage suits data that the app uses frequently and that takes up little space. Disk storage holds files and information that need to be persisted; put simply, the data still exists after the app is killed. By how the data is managed, disk storage falls into two categories: file system storage and database storage. A file system organizes data into independent files: the records inside a file have structure, but the data as a whole does not. A database system gives structure to the data as a whole; it manages the storage of the database, transactions, and operations on the database. The file system is the subsystem of the operating system that manages files and storage space.
Choosing a disk storage mode
Given these characteristics, if the data is structured and you want to run statistics and analysis on it easily, store it in a database. If the data has a single, simple structure and may be large in volume, manage it through the file system. Take the Baidu Music app as an example. Song information includes the playback link, song name, singer name, album, and so on. Each item is small but structured, and all songs need frequent statistics and fine-grained processing, so this information is a perfect fit for a database. Songs and videos downloaded to the device, on the other hand, are well suited to file system storage.
File system management mode
A file system is the operating system subsystem that manages files and storage space. iOS offers several file-based storage schemes: plist files, NSUserDefaults, Keychain, and NSKeyedArchiver (serialized storage).
Plist file
Plist files are typically used to store user settings, as well as data that the program uses often but rarely changes. For example, the common colors and configuration of an app can be stored in a plist, where the structure is clear and easy to look up. A plist file stores serialized objects, and the file format is XML. A plist file can be read by the system but cannot be modified directly. (It can be modified in other ways, but that is not recommended.)
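As a rough illustration of what "serialized objects in XML" means, the sketch below uses Python's standard `plistlib` module to write a small settings dictionary as an XML plist and read it back. The keys and values are invented for the example; on iOS the same round trip would go through the Foundation property list APIs.

```python
import plistlib

# Hypothetical app settings; any mix of strings, numbers, and booleans works.
settings = {"themeColor": "#3366FF", "fontSize": 14, "showTips": True}

# Serialize to the XML plist format, then parse it back.
xml_bytes = plistlib.dumps(settings, fmt=plistlib.FMT_XML)
restored = plistlib.loads(xml_bytes)

print(xml_bytes.decode("utf-8"))
```

Printing the bytes shows the familiar `<plist>`, `<dict>`, and `<key>` elements that make a plist easy to inspect by eye.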
NSUserDefaults storage
NSUserDefaults is easy to use. For simple values such as strings and numbers, NSUserDefaults is the first choice. NSUserDefaults is a cache in front of a plist file: when you write data to NSUserDefaults, you are actually writing to a plist file dedicated to NSUserDefaults, which it can both read and write. Because NSUserDefaults loads the data from that plist file into an in-memory cache, access is fast.
Keychain
Keychain is a secure storage container suited to sensitive information on the device. Keychain data lives outside each app's sandbox. Even if the app is uninstalled, the stored information remains as long as the system is not reinstalled, and after the app is installed again it can still use that information. Keychain access groups allow different apps to share keychain data; you specify the group when saving data to the keychain. Some Baidu apps share a login account, which is presumably implemented through keychain access groups. In short, Keychain stores data that should still be available the next time the app is installed, even after it has been deleted.
Keychain is backed by a SQLite database at /private/var/Keychains/keychain-2.db, and all data saved in it is encrypted. Strictly speaking, Keychain data is stored in a database, but since it is only used for simple data it is grouped with the file category here.
NSKeyedArchiver
Almost any type of object can be archived. Use NSKeyedArchiver to archive and NSKeyedUnarchiver to unarchive. The data is serialized before it is written and deserialized after it is read. Simple system objects (such as NSString and NSNumber from the Foundation framework) can be archived directly, while user-defined objects need to adopt the NSCoding protocol and implement its encode and decode methods.
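The encode and decode pair can be pictured as follows. This is a toy analogue in Python, with JSON standing in for the archive format; the `Song` class and its fields are hypothetical, and `encode` and `decode` only play the roles that the NSCoding encode and decode methods play on iOS.

```python
import json


class Song:
    """Hypothetical user-defined object that knows how to archive itself."""

    def __init__(self, title, artist):
        self.title = title
        self.artist = artist

    def encode(self):
        # Role of the NSCoding encode method: list the fields to archive.
        return {"title": self.title, "artist": self.artist}

    @classmethod
    def decode(cls, fields):
        # Role of the NSCoding decode method: rebuild the object from fields.
        return cls(fields["title"], fields["artist"])


original = Song("Song A", "Singer B")
archived = json.dumps(original.encode()).encode("utf-8")    # bytes ready to write to a file
restored = Song.decode(json.loads(archived.decode("utf-8")))
```

The archive is just bytes, so it can be written to disk and read back later; the object only has to describe which fields matter.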
File storage
Convert the data into an NSData object and use system functions to write it directly to the target directory. Pictures, songs, and videos are usually stored this way.
Summary
As you can see, iOS provides several very convenient ways to store files. When the data structure is simple and the number of items is small, when a single item is very large (an audio file, for example), or when the information never needs to be modified, the methods provided by the system are enough to do the job.
Database
When the data is large and highly structured, it can still be saved to files, but statistics and modifications are only possible after the whole file has been read, and the file system knows nothing about the relationships between the data inside a file. Hence the database system, which is dedicated to maintaining data and the relationships between data. A database can be thought of as a warehouse management system. In a furniture factory's warehouse, taking furniture out, adding new furniture, or swapping old furniture for a new style all go through the warehouse management system. A database management system does the same thing, except that what it manages is data.
iOS offers several ways to work with a database, including Core Data and using SQLite directly. Since Core Data's underlying store is usually a SQLite database, this article looks at SQLite's underlying principles and its read-write lock control in detail, and tries to explain what Core Data multithreading actually does.
Basic Concepts of Database
Before getting to know SQLite, it helps to understand the basic database concepts: what a database is, what a table is, and what SQL statements are.
Database
A database is like a large warehouse. Before we can use a warehouse we need to build one. The syntax for creating a database (MySQL syntax shown here) is:
create database xxx;
Table
A table corresponds to a specific room in the warehouse: some rooms store clothes, others store snacks. A table consists of fields and records. Fields define the structure of the table (the layout of the room), and records hold the actual items according to those fields.
CREATE TABLE `CLOTHES` (
  `key` int(11) NOT NULL AUTO_INCREMENT,
  `value` char(255) NOT NULL DEFAULT ' ',
  PRIMARY KEY (`key`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Here key and value are fields, and each piece of clothing actually stored is a record.
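For comparison, here is a minimal sketch of the same table in SQLite, run through Python's built-in `sqlite3` module. It assumes SQLite's dialect: there is no CREATE DATABASE statement (the database is created when it is opened), and the column types and sample rows are my own choices for illustration.

```python
import sqlite3

# Opening a connection creates the database (the "warehouse").
conn = sqlite3.connect(":memory:")

# Fields describe the structure of the table (the "room").
conn.execute(
    """
    CREATE TABLE clothes (
        "key"   INTEGER PRIMARY KEY AUTOINCREMENT,
        "value" TEXT NOT NULL DEFAULT ''
    )
    """
)

# Each inserted row is one record.
conn.execute('INSERT INTO clothes ("value") VALUES (?)', ("blue shirt",))
conn.execute('INSERT INTO clothes ("value") VALUES (?)', ("red scarf",))

rows = conn.execute('SELECT "key", "value" FROM clothes ORDER BY "key"').fetchall()
columns = [info[1] for info in conn.execute("PRAGMA table_info(clothes)")]
conn.close()
```

`columns` lists the fields and `rows` lists the records, which is exactly the split described above.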
SQL
SQL is what we say to the warehouse administrator. Suppose we want the size 3 clothes from room 2 of warehouse 1: once we say so, the administrator can act. SQL is the set of instructions we use to talk to a database. SQL statements fall roughly into the following categories:
- Data definition (SQL DDL): defines the create and drop operations for SQL schemas, base tables, views, and indexes.
- Data manipulation (SQL DML): divided into data query and data update, where update covers insert, delete, and modify. This is what we usually call add, delete, change, and query.
- Data control (DCL): includes base table authorization, integrity constraints, transaction control, and so on.
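A short sketch of the first two categories against SQLite, using Python's built-in `sqlite3` module. The `song` table and its rows are invented for the example. SQLite has no user accounts, so authorization statements such as GRANT do not apply to it and are left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Data definition (DDL): create a table and an index.
conn.execute("CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT, plays INTEGER)")
conn.execute("CREATE INDEX idx_song_title ON song (title)")

# Data manipulation (DML): insert, update, delete, then query.
conn.executemany(
    "INSERT INTO song (title, plays) VALUES (?, ?)",
    [("A", 1), ("B", 5), ("C", 9)],
)
conn.execute("UPDATE song SET plays = plays + 1 WHERE title = ?", ("A",))
conn.execute("DELETE FROM song WHERE title = ?", ("B",))
remaining = conn.execute("SELECT title, plays FROM song ORDER BY title").fetchall()

conn.commit()
conn.close()
```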
The two most common database models are relational and non-relational databases.
Relational database
The relational model reduces complex data structures to simple binary relations (that is, two-dimensional tables). In a relational database almost every operation works on one or more relational tables, and the database is managed through operations such as classifying, merging, joining, and selecting on those tables. Oracle and MySQL are common on the server. SQLite, the most commonly used database on mobile, is also relational, and this article explains it in detail.
Non-relational databases
Non-relational databases perform very well on very large, highly concurrent, fully dynamic Web 2.0 sites such as social networks. NoSQL (Not Only SQL) refers to non-relational databases. They are divided into key-value databases (such as Redis), column-oriented databases, document-oriented databases, and graph databases. In the Internet era non-relational databases are widely used on the server. The representative on mobile is Realm.
### SQLite
SQLite is a lightweight, ACID-compliant relational database management system. It is designed for embedded use, needs very few resources (only a few hundred KB of memory), and is cross-platform. The mainstream mobile operating systems, Android and iOS, both ship with SQLite built in.
ACID refers to the four properties a database management system (DBMS) must have to ensure that transactions are correct and reliable when data is written or updated: atomicity, consistency, isolation, and durability.
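Atomicity is the easiest of the four to see in action. The sketch below, using Python's built-in `sqlite3` module, moves money between two rows inside one transaction; the `account` table, the names, and the `transfer` helper are hypothetical. A CHECK constraint makes the second transfer fail halfway, and the rollback undoes the half that had already been applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Credit dst, then debit src, as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False  # the CHECK constraint rejected a negative balance


ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # credit applied, debit rejected, all undone
balances = dict(conn.execute("SELECT name, balance FROM account"))
conn.close()
```

After the failed transfer, bob's balance is back to what it was before the attempt, even though the credit statement itself had run.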
To optimize SQLite speed, you must understand how SQLite works.
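[Figure: SQLite architecture. The backend is made up of the B-Tree, Pager, and OS interface layers, with a separate group of accessories.]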
For example, an operation on the Employee table would look like this:
The process is as follows:
1. The SQL statement is issued.
2. Disk I/O is triggered.
3. The disk returns one page of the index table for the target table, containing several entries.
4. The target record is searched for among those entries. If no entry meets the condition, steps 2 and 3 are repeated until one is found.
5. The record's position in the original table is read from the matching index entry.
6. The operation is performed on the target record in the original table, completing the SQL statement.
Two terms appear here, index table and original table; they are explained next.
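One way to watch the index table come into play is to ask SQLite for its query plan before and after creating an index. The sketch below uses Python's built-in `sqlite3` module; the `employee` columns and the index name are assumptions, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee (name, salary) VALUES (?, ?)",
    [(f"name{i}", i) for i in range(1000)],
)

query = "SELECT salary FROM employee WHERE name = ?"


def plan(conn, sql, params):
    """Return SQLite's description of how it would run the statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)  # the last column is the description


before = plan(conn, query, ("name42",))  # no index yet: the whole table is scanned
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")
after = plan(conn, query, ("name42",))   # the index is searched first
conn.close()

print(before)
print(after)
```

Without the index the plan is a scan of the original table; with it, SQLite searches the index and only then visits the matching row.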
Disk I/O bottleneck
We know that database files live on disk, and that disk operations are far slower than memory operations. A query usually cannot be completed with a single I/O operation, so reducing the number of disk I/Os is the key to optimizing SQLite performance.
#### Disk read mode
Disks are read in pages. A page is the basic logical unit of storage in a computer, and data moves between memory and disk one page at a time. Even if only one byte is needed in memory, one or more whole pages are fetched from disk; this is a system-level read-ahead strategy.
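SQLite exposes its own page size and page count through pragmas, which makes the page-based layout easy to check. A minimal sketch with Python's built-in `sqlite3` module follows; the table `t` and its payload are invented, and the default page size depends on the SQLite build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
page_size = conn.execute("PRAGMA page_size").fetchone()[0]  # bytes per page

conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO t (payload) VALUES (?)",
    [("x" * 100,) for _ in range(1000)],  # 1000 rows of 100 bytes each
)
conn.commit()

page_count = conn.execute("PRAGMA page_count").fetchone()[0]  # pages in the database
total_bytes = page_size * page_count
conn.close()

print(page_size, page_count, total_bytes)
```

The database always occupies a whole number of pages, so its size is `page_size * page_count` rather than the sum of the row sizes.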
Index table
Since disk I/O is the performance drain, the more records fit in one page, the more likely each I/O is to hit the record a SQL statement needs. Suppose we query a full table with 30 fields of 100 bytes each: one record takes about 30 * 100 = 3000 bytes, so with a 4 KB page only one record fits on a page. If we instead build a second table that maps to the original one but keeps only the indexed field (the key, say) and the position of the corresponding record in the original table, a page can hold more than 100 entries. This table that maps to the original table is called an index table.
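The arithmetic above is worth writing out. In the sketch below the page size and record size come from the text, while the 8-byte key and 8-byte position are my own assumptions, and page headers are ignored, so the numbers are only indicative.

```python
PAGE_SIZE = 4 * 1024          # bytes per page, as in the text
FIELDS, FIELD_SIZE = 30, 100  # 30 fields of 100 bytes each

record_size = FIELDS * FIELD_SIZE            # 3000 bytes per full record
records_per_page = PAGE_SIZE // record_size  # only 1 full record fits

KEY_SIZE, POSITION_SIZE = 8, 8               # assumed sizes, for illustration only
index_entry_size = KEY_SIZE + POSITION_SIZE  # 16 bytes per index entry
entries_per_page = PAGE_SIZE // index_entry_size


def pages_to_scan(n_rows, per_page):
    """Pages needed to hold n_rows when per_page items fit on each page."""
    return -(-n_rows // per_page)  # ceiling division


table_pages = pages_to_scan(10_000, records_per_page)
index_pages = pages_to_scan(10_000, entries_per_page)
```

Under these assumptions, looking through 10,000 rows touches far fewer pages via the index table than via the original table, which is where the I/O saving comes from.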
B-tree
A binary tree offers O(log n) lookup and insertion, but each node of the tree corresponds to one page on disk, so every level visited costs a disk I/O. To reduce the number of disk I/Os, the height of the tree must be as small as possible. A binary tree is tall and thin; what we need is a short, fat tree, and that is exactly what a B-tree is. A B-tree is a balanced multiway search tree in which each node has at most k children. k is called the order of the B-tree, and its size depends on the size of the disk page.
SQLite organizes both index tables and original tables as B-trees. If no index table is created, the size of each entry in the original table determines the order of the B-tree. The larger the order, the shorter and fatter the tree, so keep entries in the original table and the index table as small as possible. Use strings sparingly and prefer numeric fields, because numeric fields take up fewer bytes.
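To get a feel for how much the order matters, the sketch below counts the levels needed to address a given number of records when every node fans out to `order` children. It is a simplification that treats the tree as perfectly full and ignores how real B-tree pages are laid out; `tree_height` is a made-up helper.

```python
def tree_height(n_records, order):
    """Smallest number of levels h such that order**h >= n_records."""
    height, capacity = 0, 1
    while capacity < n_records:
        capacity *= order
        height += 1
    return height


binary_height = tree_height(1_000_000, 2)   # tall and thin
btree_height = tree_height(1_000_000, 100)  # short and fat
```

Each level is roughly one disk I/O, so going from order 2 to order 100 cuts a lookup among a million records from about twenty page reads to about three.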
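SQLite files
A SQLite database on disk consists of the main .db file, which holds the tables, plus companion files next to it: a -wal file and a -shm file when write-ahead logging (WAL) is enabled, or a -journal file in rollback-journal mode.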
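The companion files of write-ahead logging can be observed directly. The sketch below, using Python's built-in `sqlite3` module, switches a throwaway database to WAL mode and lists the directory while the connection is still open; the file name `demo.db` is arbitrary, and it assumes the temporary directory is on a file system where WAL is supported.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "demo.db")
    conn = sqlite3.connect(path)

    # Ask SQLite to use write-ahead logging; the pragma returns the mode now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]

    conn.execute("CREATE TABLE t (x INTEGER)")
    conn.execute("INSERT INTO t VALUES (1)")
    conn.commit()

    # While the connection is open, the log and shared-memory files sit next to the database.
    files_while_open = sorted(os.listdir(folder))
    conn.close()

print(mode, files_while_open)
```

Committed changes land in the `-wal` file first and are copied into the main database file later, at a checkpoint.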
SQLite multithreading
SQLite supports multithreaded access, in three threading modes:
- Single-thread: all mutexes are disabled, and concurrent use causes errors.
- Multi-thread: safe as long as no database connection is used by more than one thread at the same time. Internally, the locks on database connections and prepared statements are disabled, so the same database connection or prepared statement must not be used concurrently from multiple threads.
- Serialized: all locks are enabled, including bCoreMutex and bFullMutex. Because database connections and prepared statements are locked, threads that use these objects cannot run concurrently and are serialized instead.
Two terms used above:
- Database connection: each time a database is opened, the database handle that is obtained is a database connection.
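static sqlite3 *openDb() {
    // dbPath and database are assumed to be declared elsewhere at file scope.
    if (sqlite3_open(dbPath, &database) != SQLITE_OK) {
        // Read the error message before closing the handle.
        NSLog(@"Failed to open database: %s", sqlite3_errmsg(database));
        sqlite3_close(database);
        database = NULL;
    }
    return database;
}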
- Prepared statement: a prepared statement is managed by its database connection, so using it also counts as using that connection. In multi-thread mode, therefore, concurrently calling sqlite3_prepare_v2() on the same database connection to create prepared statements, or concurrently calling sqlite3_bind_*(), sqlite3_step(), and similar functions on any prepared statement of the same connection, results in errors.
The standard distribution of SQLite defaults to serialized mode, the SQLite library built into iOS uses multi-thread mode, and Python's SQLite uses serialized mode.
With thread safety under each mode in mind, there are four ways to access the database:
- SQLite in single-thread mode, with one dedicated thread accessing the database. Other threads have to communicate with that thread, which is cumbersome to implement.
- SQLite in single-thread mode, with a serial queue accessing the database. The queue runs only one task at a time, and all tasks in the queue share one database connection. You can use dispatch_queue_create() to create a serial queue for this purpose.
- SQLite in multi-thread mode, with each thread creating its own database connection. The connection has to be opened and closed each time, which costs some extra time. Here you can use a concurrent queue and open and close the database connection on every read or write.
- SQLite in serialized mode, with all threads sharing one global database connection. Serialized mode amounts to letting SQLite maintain the queue itself, but statements from different threads are interleaved in no guaranteed order, so transactional behavior cannot be guaranteed.
FMDB multithreading
FMDB is a data storage framework that wraps SQLite on the iOS platform. It is object-oriented, wraps SQLite's C API in Objective-C, and is very convenient to use. It is the most widely used third-party data storage framework on iOS.
FMDB uses the SQLite library that ships with iOS, which runs in multi-thread mode by default.
FMDB manages multithreading with serial queues, through FMDatabaseQueue. When an FMDatabaseQueue is initialized, it creates a serial queue and attaches a unique identifier to it:
_queue = dispatch_queue_create([[NSString stringWithFormat:@"fmdb.%@", self] UTF8String], NULL);
dispatch_queue_set_specific(_queue, kDispatchQueueSpecificKey, (__bridge void *)self, NULL);
Database operations are dispatched synchronously onto this serial queue. This guarantees that, for each database connection, only one operation touches the database at a time, which satisfies the requirement that a connection must not be used concurrently.
dispatch_sync(_queue, ^() { /* database operation */ });
An FMDatabaseQueue is a serial queue. Even if you call it from multiple threads, the work still executes serially, which ensures thread safety.
If you want several threads to operate on a database at the same time, you can create multiple FMDatabaseQueue instances. They run concurrently with each other, although each queue is serial internally.
Because FMDB dispatches synchronously onto a serial queue, nesting is dangerous. If task 1 is running on the queue and, from inside it, submits task 2 and waits for its result, task 2 cannot start until task 1 finishes, while task 1 cannot finish until task 2 returns: a deadlock. When using FMDB, never nest one queue call inside another on the same queue.
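The serial-queue idea, and the reason nesting deadlocks, can be shown with a toy model. The sketch below is written in Python and is not FMDB's implementation: `DatabaseQueue` and `in_database` are hypothetical names, a single worker thread plays the part of the serial dispatch queue, and the re-entrancy check turns what would be a silent deadlock into an error.

```python
import queue
import sqlite3
import threading


class DatabaseQueue:
    """Toy serial database queue: one worker thread owns the connection."""

    def __init__(self, path):
        self._tasks = queue.Queue()
        self._worker = threading.Thread(target=self._run, args=(path,), daemon=True)
        self._worker.start()

    def _run(self, path):
        conn = sqlite3.connect(path)  # created and used only on the worker thread
        while True:
            block, done, box = self._tasks.get()
            if block is None:  # shutdown marker
                break
            try:
                box["result"] = block(conn)
            except Exception as exc:
                box["error"] = exc
            finally:
                done.set()
        conn.close()

    def in_database(self, block):
        """Run block(conn) on the queue and wait for it to finish."""
        if threading.current_thread() is self._worker:
            # Waiting here would block the only thread able to run the nested task.
            raise RuntimeError("in_database called re-entrantly; this would deadlock")
        done, box = threading.Event(), {}
        self._tasks.put((block, done, box))
        done.wait()
        if "error" in box:
            raise box["error"]
        return box.get("result")

    def close(self):
        self._tasks.put((None, None, None))
        self._worker.join()


db = DatabaseQueue(":memory:")
db.in_database(lambda conn: conn.execute("CREATE TABLE t (x INTEGER)").fetchall())
db.in_database(lambda conn: conn.execute("INSERT INTO t VALUES (1)").fetchall())
count = db.in_database(lambda conn: conn.execute("SELECT COUNT(*) FROM t").fetchone()[0])


def nested(conn):
    # Task 1 submits task 2 to the same queue and waits for it.
    return db.in_database(lambda inner: inner.execute("SELECT 1").fetchone())


try:
    db.in_database(nested)
    nested_error = None
except RuntimeError as exc:
    nested_error = str(exc)

db.close()
```

Callers on any thread are funneled through one worker, so the connection is never used concurrently. The nested call fails fast here; without the check, the worker would wait forever for a task that only it could run.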
Core Data Multithreading
Core Data is Apple's official data persistence framework. It resembles an ORM (object-relational mapping) but does more than an ORM. Core Data usually uses a SQLite database as its persistent store, though binary, XML, and other store types are also available. With SQLite, Core Data runs in multi-thread mode by default. Given that an NSManagedObjectContext (MOC) cannot be used across threads, Core Data presumably maps each MOC onto a database connection. Connections therefore follow these rules:
- Each thread has to create its own NSManagedObjectContext to manage its own objects.
- NSManagedObject objects cannot be used across threads.
Since database operations cannot cross threads and need to be synchronized, the best current approach for avoiding deadlocks and similar problems is to operate on the database through three layers of NSManagedObjectContext.
A MOC can set a parentContext, and one parentContext can have multiple child contexts. Saving a child context pushes its changes into the parentContext, and the parentContext performs the actual save. The parentContext therefore knows about every change made in its children, which removes the need for manual synchronization.
The three-layer model lets us do UI-related work in the main context and perform saves in a child context on a background thread. When the background context saves, it pushes its state to the main context; when the main context saves, it pushes the state to the single private context at the top. Only when that top-level context saves is the data actually written to the database. Because the top-level context works on a background thread, the write does not affect the UI.
Note that there can be several background child contexts at the bottom. They should be throwaway: create one, use it, and discard it rather than keeping it around, which avoids tangled context state. Child contexts do not communicate with each other; doing so has little value and causes many problems. There is exactly one main context, used for UI-related operations, and exactly one private context at the top, which ensures there is only one database connection.
In other words, we deliberately make sure that only the top-level context touches the database, and that it does so on a single thread. This satisfies the requirements of SQLite's multi-thread mode. It also means the advantages of multi-thread mode are not exploited, since the behavior is close to serialized mode, but it keeps access thread-safe and does not block the main thread.
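How a save travels up the three layers can be modeled in a few lines. The sketch below is a toy in Python, not Core Data: the `Context` class, its `pending` dictionary, and the plain `store` dictionary standing in for the SQLite file are all assumptions, and threads are left out to keep the data flow visible.

```python
class Context:
    """Toy stand-in for a managed object context with an optional parent."""

    def __init__(self, name, parent=None, store=None):
        self.name = name
        self.parent = parent
        self.store = store  # only the root context writes to the store
        self.pending = {}   # unsaved changes: object id -> value

    def set(self, object_id, value):
        self.pending[object_id] = value

    def save(self):
        if self.parent is not None:
            self.parent.pending.update(self.pending)  # push changes one level up
        else:
            self.store.update(self.pending)           # root: persist for real
        self.pending = {}


store = {}  # stands in for the SQLite file
root = Context("private root", store=store)
main = Context("main/UI", parent=root)
worker = Context("background worker", parent=main)

worker.set("song-1", "downloaded")

worker.save()  # the change reaches the main context, nothing is on disk yet
after_worker = (dict(main.pending), dict(store))

main.save()    # the change reaches the root context, still nothing on disk
after_main = (dict(root.pending), dict(store))

root.save()    # only now is the store written
```

Only the root context ever touches `store`, which mirrors the point above: a single top-level context, and therefore a single database connection.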
Conclusion
There are many storage solutions on iOS, and the right one depends on the business scenario. This article introduced the principles of the various storage methods and discussed the advantages, disadvantages, and use cases of each, without going deep into implementation details. File storage is simple to implement and works well in simpler scenarios. Database storage suits complex business scenarios, though it comes with many pitfalls and technical difficulties. The next article will discuss FMDB and Core Data in detail.