Abstract: With the growing popularity of MPC (secure multi-party computation), privacy-preserving computing, and related concepts, many government agencies and financial enterprises have begun to consider taking part in multi-party computing scenarios to increase the application value of their data.

This article is from the Huawei Cloud community: “Using PSI to Solve the Data Collision Problem of Federated Computing” by breakDraw.

Federated computing scenario

As MPC, privacy-preserving computing, and related concepts gain popularity, many government agencies and financial enterprises have begun to consider taking part in multi-party computing scenarios to increase the application value of their data.

In the following scenario, for example, a bank wants to combine data from the water and power department with data on its own depositors to derive a credit rating for each company.

The bank might then run SQL like the following to compute a credit score:

SELECT 0.5 * a.grant_amount * 0.3
     + 0.4 * a.discount_amount * 0.3
     + 0.2 * a.target_amount * 0.3
     + (0.05 * b.water_fee_paid + 0.05 * b.gas_fee + 0.05 * b.electricity_fee) * 0.1
FROM partya.tax a
JOIN partyb.amount b ON a.id = b.id
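To make the weighting in the query easier to read, the same arithmetic can be written as a small Python function. This is only an illustrative sketch: the function name and the dictionary keys are hypothetical English renderings of the columns, and the two arguments stand for one joined row from each party's table.

```python
def credit_score(row_a: dict, row_b: dict) -> float:
    """Mirror the weighted sum of the SQL query for one joined row.

    row_a and row_b are hypothetical dicts holding the columns of
    party A's and party B's tables for the same id.
    """
    # Party A's three amounts are weighted 0.5 / 0.4 / 0.2, each scaled by 0.3.
    part_a = 0.3 * (
        0.5 * row_a["grant_amount"]
        + 0.4 * row_a["discount_amount"]
        + 0.2 * row_a["target_amount"]
    )
    # Party B's three utility fees are each weighted 0.05, then scaled by 0.1.
    part_b = 0.1 * 0.05 * (
        row_b["water_fee_paid"] + row_b["gas_fee"] + row_b["electricity_fee"]
    )
    return part_a + part_b
```

The function can only be evaluated for ids that appear in both tables, which is why the join sits at the center of the rest of the article.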

The problem

In the federated computing scenario above, a join is needed to associate the water and power bureau’s data with the bank’s data. In the traditional scheme, this matching (the “collision”) is performed inside a TEE (trusted execution environment) to obtain the associated records, which are then used for the calculation.

However, the water and power bureau has a very large number of users, while the bank has comparatively few depositors, so the number of records that actually match is bounded by the number of bank depositors.

If all of the water and power bureau’s data is uploaded to the TEE, the cost of transferring it between software and hardware is very high, and sensitive data from records that never match is exposed along the way.

The identities of the bank’s customers may also be highly sensitive.

The solution

PSI (private set intersection) can effectively solve both of the problems above.

PSI usually has the following three characteristics:

  • Semi-trusted scenario: The data parties do not want to expose all of their data; they only want to obtain the intersection of their data sets.

  • Data minimization: Data outside the intersection of the data sets must not be disclosed to either party.

  • Secure computing: Both parties involved in the computation jointly run a secure computing protocol to ensure data security.

The specific flow diagram is as follows:

This process lets the IDs of party A and party B be matched entirely in ciphertext, yielding the set of associated IDs on which the subsequent output is based.
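The idea of matching IDs in ciphertext can be sketched in Python. The sketch below assumes one well-known PSI construction, Diffie-Hellman-style commutative blinding, which is not necessarily the protocol that the original diagram or TICS uses. All names (`hash_to_group`, `new_secret`, `blind`, `psi`) are hypothetical, both parties are simulated in a single process, and the modulus is a toy value that is far too small for real use.

```python
import hashlib
import math
import secrets

# Toy prime modulus (2**127 - 1 is a Mersenne prime); NOT secure, illustration only.
P = 2**127 - 1


def hash_to_group(identifier: str) -> int:
    """Map an id to an integer in [1, P-1] via SHA-256."""
    digest = hashlib.sha256(identifier.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1


def new_secret() -> int:
    """Pick a random exponent coprime to P-1 so that blinding is a bijection."""
    while True:
        k = secrets.randbelow(P - 3) + 2  # uniform in [2, P-2]
        if math.gcd(k, P - 1) == 1:
            return k


def blind(values, secret):
    """Raise every value to the secret exponent modulo P."""
    return [pow(v, secret, P) for v in values]


def psi(ids_a, ids_b):
    """Return the ids of party A that also appear in party B's list."""
    key_a, key_b = new_secret(), new_secret()  # each key stays with its owner

    # Step 1: each party blinds the hashes of its own ids and sends them over.
    blinded_a = blind([hash_to_group(i) for i in ids_a], key_a)  # A -> B
    blinded_b = blind([hash_to_group(i) for i in ids_b], key_b)  # B -> A

    # Step 2: B blinds A's values again (order preserved) and returns them;
    # A blinds B's values with its own key.
    double_a = blind(blinded_a, key_b)       # computed by B, sent to A
    double_b = set(blind(blinded_b, key_a))  # computed by A

    # Step 3: H(x)^(a*b) == H(y)^(b*a) exactly when x == y, so A can match
    # doubly blinded values without seeing B's plaintext ids.
    return {i for i, d in zip(ids_a, double_a) if d in double_b}
```

For example, `psi(["u1", "u2", "u3"], ["u2", "u3", "u4"])` yields the ids common to both lists. Only the doubly blinded values are compared, so ids outside the intersection are never exchanged in plaintext, and only rows for the matched ids need to take part in the later join.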

Application

The federated computing service of TICS currently supports PSI.

On the alliance management page, the administrator enables High-level Privacy Protection. Once it is enabled, if a JOIN SQL statement meets the requirements for PSI, TICS uses PSI to construct the execution plan, performs the JOIN matching, and then continues with the subsequent calculation.

Create a job and execute the corresponding SQL JOIN job.

While the job runs, you can view its DAG diagram in TICS, which shows the whole PSI process. The output is the same as that of a direct join.
