The time machine may be science fiction’s favorite ultimate weapon, but it is also the least likely ever to be built.

Let me put two constraints on this almighty time machine:

  1. It can only look backward, not forward.
  2. It can only observe, not modify history.

This restricted version of the time machine is still powerful; with it, history’s mysteries could be solved. It’s like having surveillance cameras installed in every corner of the world, and when you think about it, it doesn’t seem so far-fetched. I don’t know when humans will invent it, or whether it will be a blessing or a curse, but I do know it would be useful to programmers.

In fact, most programmers today already use a time machine called Git. Version control is arguably one of the greatest inventions in software engineering to date, allowing us to change code without fear: old code never disappears, and we can always retrieve it from the commit history if we need to.

However, code is only one part of a program. In my opinion, a program has three main components: besides the code itself, there are the program’s state while it runs and its data store (such as the database it uses). At the code level we can already time travel. But can we build time machines for the other two parts? That’s what this article is about.

Immutable data

To implement a time machine, we need to preserve every state that has ever existed. The key concept here is immutable data. Like history itself, immutable data cannot be changed or destroyed: each piece of data describes the state of some object at one point in time. To change the object’s state, we do not modify the existing data; we create new data representing the state at a new point in time. The old data is no longer current, but it is never destroyed. This makes it trivial to look up an object’s state at any past moment, which is why immutable data is the foundation of a time machine.

Unfortunately, in traditional programming languages most data structures are mutable. When you add an element to an array, for example, the contents of the array change in place, and we can no longer access the array as it was before the change.
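To make the contrast concrete, here is a small plain-JavaScript sketch (no library required; the variable names are just for illustration) showing in-place mutation versus an immutable update that copies into a new value:

```javascript
// Mutable update: the original array is changed in place.
const mutable = [1, 2, 3];
mutable.push(4); // the old contents [1, 2, 3] are gone forever

// Immutable update: freeze the original and build a new array instead.
const v1 = Object.freeze([1, 2, 3]);
const v2 = Object.freeze([...v1, 4]); // a new value; v1 is untouched

console.log(v1); // [ 1, 2, 3 ]  -- the past is still observable
console.log(v2); // [ 1, 2, 3, 4 ]
```

Libraries like Immutable.js apply the same idea but use structural sharing under the hood, so each new version does not require a full copy.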

In recent years, however, more and more people have discovered the advantages of immutable data. The concept, once confined to functional programming languages, is now being adopted and widely used in mainstream languages. Facebook, for example, has open-sourced the well-known Immutable.js library, bringing immutable data into the JavaScript ecosystem. Its home page has an overview of immutable data that puts it perfectly; if my previous paragraph confused you, take a look at this description:

Immutable collections should be treated as values rather than objects. While objects represent some thing which could change over time, a value represents the state of that thing at a particular instance of time.

A time machine for the running state of the program

“The hardest bugs to debug are the ones that can’t be reproduced reliably.” Most programmers know this all too well. What a wonderful world it would be if you had a time machine that could rewind to the exact moment an error occurred and let you observe the program’s state and all the events that followed.

In the world of front-end development, this problem has been solved by Dan Abramov with his Redux framework. Not only did Dan build the framework, he also implemented the time machine itself in Redux DevTools: you can drag a timeline to replay the application’s history, including its internal data state and every change in the interface.

For those of you unfamiliar with Redux, here’s a quick overview of how it works.

  • The state of the entire program is managed in a single Application State object, whose value is immutable data.
  • Changes to program state must go through Actions. An Action may be a user input event, a network request event, or some other custom event. When an Action changes the program state, it generates a new Application State instead of modifying the existing one.
  • The application’s interface is entirely determined by the current Application State.
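The cycle above can be sketched in a few lines of plain JavaScript. This is an illustrative reducer-style store, not the actual redux package, and the action names and cart shape are made up:

```javascript
// A minimal Redux-style store that also records its own history,
// which is what makes time-travel debugging possible.
function createStore(reducer, initialState) {
  let state = initialState;
  const history = [{ action: { type: '@@INIT' }, state }];
  return {
    getState: () => state,
    dispatch(action) {
      state = reducer(state, action);  // produce a NEW state, never mutate
      history.push({ action, state }); // remember every step
    },
    getHistory: () => history,
  };
}

// A reducer returns a fresh state object for every action.
const reducer = (state, action) => {
  switch (action.type) {
    case 'ADD_TO_CART':
      return { ...state, cart: [...state.cart, action.item] };
    case 'REMOVE_FROM_CART':
      return { ...state, cart: state.cart.filter((i) => i !== action.item) };
    default:
      return state;
  }
};

const store = createStore(reducer, { cart: [] });
store.dispatch({ type: 'ADD_TO_CART', item: 'book' });
store.dispatch({ type: 'ADD_TO_CART', item: 'pen' });
store.dispatch({ type: 'REMOVE_FROM_CART', item: 'book' });

// "Drag the timeline": every past state is still there.
console.log(store.getHistory()[2].state); // { cart: [ 'book', 'pen' ] }
console.log(store.getState());            // { cart: [ 'pen' ] }
```

Because each dispatch produces a new immutable state, storing the whole history is just storing references, which is exactly what Redux DevTools exploits.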

Redux’s development tools record every action in the application’s execution history along with the corresponding Application State. So Redux can restore not only the state of the program at each moment, but also the action that produced each state, giving the debugger a God-like perspective. That’s the mark of a really good idea: so simple, yet so powerful.

Another interesting recent project in the JS ecosystem is Automerge, which tackles the problem of synchronizing application state across multiple devices. It can be used to manage application state and is itself built on immutable data, generating a new data instance for every change. Interestingly, it also lets you record each change with a commit message, and its getHistory method returns all the changes in an Automerge document’s history. Very Git-like, isn’t it?
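To give a flavor of this style of API, here is a hand-rolled toy version of the idea, not Automerge’s real implementation; the function names only loosely mirror it:

```javascript
// A toy git-like document: each change takes a commit message and
// produces a new frozen snapshot; old snapshots are kept forever.
function init() {
  return { snapshot: Object.freeze({}), history: [] };
}

function change(doc, message, updates) {
  const snapshot = Object.freeze({ ...doc.snapshot, ...updates });
  return {
    snapshot,
    history: [...doc.history, { message, snapshot }],
  };
}

const getHistory = (doc) => doc.history;

let doc = init();
doc = change(doc, 'add title', { title: 'Time machines' });
doc = change(doc, 'add author', { author: 'me' });

console.log(getHistory(doc).map((h) => h.message)); // [ 'add title', 'add author' ]
console.log(getHistory(doc)[0].snapshot);           // { title: 'Time machines' }
```

The real Automerge additionally merges concurrent edits from different devices, which is where most of its complexity lives.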

Both Redux and Automerge can be used to build time machines that show a program’s running state. But these techniques suit the front end better than the back end, where the service itself tends to hold only a small amount of state; most of it lives in the storage tier, that is, in databases or files. Different techniques are needed there, and they are the focus of the next section.

A time machine for data storage

While it is important for a program to run reliably, data corruption or loss is even worse than a crash. I’m not talking about data problems caused by database system failures, but about those caused by logic errors in the program. Full database backups solve this to some extent: if you want to know the value of a record a month ago, you might be able to pull up a backup and look. But for a large database, restoring a backup is very expensive, and if you want to see every change in a record’s history, backups won’t help at all. For that, you need to equip your database with a time-travel feature.

In today’s era of big data, we want to record not only the outcome of things but also the user’s every move. For example, we want to record not just what is in the shopping cart when the user checks out, but also which goods the user put into the cart along the way. At that point, you’ll want a database time machine to analyze historical user data.

In the previous section we saw that a time machine is built on immutable data, but is there a database built on the immutability principle? Blockchains are one example, but if you are hoping to use one to replace MySQL, forget it.

The leader in this field is the Datomic database. The name is unfamiliar to most people, but it was written by Rich Hickey, creator of the Clojure language. All of Clojure’s data structures are immutable, and Rich built Datomic, an immutable database, on the same idea.

How awesome is Datomic? It is the database version of Git. Datomic is distributed, with no central server cluster. It is immutable: the entire history of the database since its creation is recorded, and each transaction can carry a commit message. It has its own declarative query language, different from SQL but also very powerful. However, because Datomic is commercial, closed-source software and is closely tied to Clojure, its impact outside the Clojure community has been limited.

As we all know, the cost of database migration is very high, and many large systems cannot move off their existing SQL relational databases, so however cool Datomic is, we can only look on enviously. Is there a way to achieve our goal without changing the database system? The answer is yes, and here are three ideas:

  1. Build versioning on top of the SQL database: add a version number to each record, and have every change produce a new version of that record.
  2. Create a separate log table, and append a change record to it each time the data changes, recording the type of operation and the changed values.
  3. Use the database’s own change log to preserve every historical version of a record.

1) is the most expensive and is really only feasible for new projects. With 2), since SQL INSERT and UPDATE statements do not return the changed record, extra queries are needed to read the changed values, which puts additional load on the database and is awkward to implement. 3), using the database’s own log system, is in my opinion the cheapest scheme, although the implementation varies from system to system; below is a brief introduction to doing it in MySQL.

The first step is to enable the MySQL binlog and set its format to row:

log-bin = /path/to/log
binlog_format=row

With this enabled, MySQL keeps a complete record of every row inserted or updated in the database, which is exactly the change history we want. But since the binlog is a binary format, we also need a tool to extract its records and save them to a target store, such as another database or a distributed file system like HDFS.

Maxwell is an open-source tool from Zendesk that reads the MySQL binlog and sends its contents to message queues such as Kafka, Redis, and RabbitMQ. Configuration is very simple and the official documentation is clear, so I won’t repeat it here. For each data update, Maxwell outputs a JSON object, for example:

mysql> insert into test.foo set id = 1, name = 'fire';
maxwell: {
  "database": "test",
  "table": "foo",
  "type": "insert",
  "ts": 1449786310,
  "xid": 940752,
  "commit": true,
  "data": { "id":1, "name": "fire" }
}

By persisting the JSON that Maxwell generates, you can query the value of any record in the original database at any point in time, which gives you a time machine for an existing MySQL database.
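As a sketch of what such a query might look like, suppose we have persisted Maxwell’s JSON events in order. Replaying them up to a timestamp reconstructs a row’s value at that moment. This is a minimal illustration: the field names follow the Maxwell output shown above, but the events themselves and the `rowAt` helper are made up:

```javascript
// Hypothetical store of Maxwell change events, in commit order.
const events = [
  { table: 'foo', type: 'insert', ts: 1449786310, data: { id: 1, name: 'fire' } },
  { table: 'foo', type: 'update', ts: 1449786400, data: { id: 1, name: 'water' } },
  { table: 'foo', type: 'delete', ts: 1449786500, data: { id: 1, name: 'water' } },
];

// Replay all events for one row up to timestamp `ts` to
// reconstruct its value at that point in time.
function rowAt(events, table, id, ts) {
  let row = null;
  for (const e of events) {
    if (e.table !== table || e.data.id !== id || e.ts > ts) continue;
    row = e.type === 'delete' ? null : { ...e.data };
  }
  return row;
}

console.log(rowAt(events, 'foo', 1, 1449786350)); // { id: 1, name: 'fire' }
console.log(rowAt(events, 'foo', 1, 1449786450)); // { id: 1, name: 'water' }
console.log(rowAt(events, 'foo', 1, 1449786600)); // null
```

In practice you would index the event store by table and primary key so that a point-in-time lookup doesn’t have to scan the full history.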

A time machine for life

At the end of this article, let me indulge my imagination. It’s great to build time machines for programs, but it would be even better to build one that records your life. In fact, I think that day may not be far off. We could wear smart glasses that continuously record everything we see and hear, uploading the video stream over wireless networks to the cloud for permanent storage, so that we could recall any moment of our lives at any time.

Not only that: because every video carries time and location information, AI could analyze its content, identifying the people and scenes in it or extracting its themes, so we could run all kinds of searches over the timeline. For someone like me who loses things constantly, the next time I can’t find something I could simply search for the video footage associated with it, solving a major pain point in my life.

Most of the technologies above already exist; what’s missing is the ability to capture a first-person recording of every second of every day, and I look forward to that day. Until the life time machine arrives, though, I suggest keeping a diary, recording your life bit by bit along with your passing whims: a low-tech version of the life time machine that everyone can have right now.

That’s all for today’s sharing. Comments and discussion are welcome.