Interviewer: Tell me how your company’s system is designed to prevent duplicate data. A must-ask question for architects

2022-07-07

A project I am responsible for reported a problem: a user's operation rollback had failed. In our design, rolling back an operation means returning to the state before that operation. The logs showed that the user had performed the operation twice, that is, the interface for submitting the operation was called twice, so the state recorded before the latest operation was identical to the current state. The rollback itself was therefore working correctly; the real problem was that the operation interface had been called twice.

Duplicate-submission prevention had been left to the front end: after the user clicks the button and the backend returns a successful result, the button is hidden. Practice has shown that client-side restrictions are not fully reliable.

This scenario brings us to today's question: what is interface idempotency, and how do we ensure it?

What is interface idempotency?

Idempotence is a mathematical concept. Applied to an interface, it means that when the same request is sent to the same interface many times, the operation must be executed only once. When an exception occurs and the call is retried, the system cannot afford the repeated effect, so it must be prevented. Without idempotency, the consequences can be serious:

- Payment interface: repeated payments result in multiple deductions.
- Order interface: the same order may be created multiple times.

Why does the idempotency problem arise?

So when does the interface idempotency problem come up?

- Network fluctuations, which may cause repeated requests.
- Repeated user operations: the user may unintentionally trigger an order transaction several times, or trigger it again on purpose because there was no response.
- Applications that use a failure or timeout retry mechanism (Nginx retry, RPC retry, business-layer retry, etc.).
- Refreshing the page repeatedly.
- Using the browser back button to repeat a previous operation, which resubmits the form.
- Resubmitting a form from the browser history.
- Repeated HTTP requests sent by the browser.
- Scheduled tasks that run repeatedly.
- Users double-clicking the submit button.

How do you ensure idempotency of interfaces?

Now for the key question: how do we make an interface idempotent? The solutions fall into two directions: the client prevents repeated calls, and the server verifies requests. Client-side prevention of duplicate submissions is not infallible, but it has the advantage of being relatively simple to implement.

The button can be clicked only once: typically the button is grayed out or put into a loading state after the click. This eliminates duplicate records caused by repeated clicks, for example an add operation that produces two records because the user clicked twice.

Token mechanism: duplicate submissions are allowed functionally, but they must not produce side effects; for example, n clicks still produce only one record. Concretely, the page requests a token when it is opened and carries that token on all subsequent requests, and the backend uses the token to reject repeated requests.

Post/Redirect/Get (PRG): after a form is submitted, the user is redirected to another page.
This prevents the user from resubmitting the form by pressing F5 to refresh. It also avoids the browser's form-resubmission warning and eliminates the same duplicate-submission problem caused by the browser's forward and back buttons.

Storing a special token in the session: the server generates a unique identifier and stores it in the session. The front end obtains this identifier and writes it into a hidden form field, so it is submitted together with the information the user enters. On the server side, the value of the hidden field is compared with the unique identifier in the session. If they are equal, the request is being submitted for the first time: it is processed, and the unique identifier is removed from the session. If they are not equal, the request is a repeat and is not processed.

Unique index, to prevent new dirty data: with the database's unique index mechanism, the database throws an exception when duplicate data is inserted, which prevents dirty data.

Optimistic lock: when updating existing data, you can use an optimistic lock. Add a version column when designing the table and make every update conditional on it, which preserves both execution efficiency and idempotency.

UPDATE table SET version = version + 1 WHERE id = #{id} AND version = #{version}

Example: when there is a duplicate request, the first request reads the current version of the item, which is 1. Because the first request has not yet updated the item, the second request also reads version 1. The first request then runs the update, using the version as a condition and incrementing it, so the version of the item becomes 2.
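The version check can be tried out with a minimal sketch. It assumes an in-memory SQLite database, and the `item` table, its `stock` column, and the helper names `read_version` and `deduct_stock` are hypothetical, chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: `version` is the optimistic-lock column, `stock` is sample business data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, stock INTEGER, version INTEGER)")
conn.execute("INSERT INTO item (id, stock, version) VALUES (1, 10, 1)")
conn.commit()


def read_version(item_id: int) -> int:
    """Read the version a request will later use as its update condition."""
    row = conn.execute("SELECT version FROM item WHERE id = ?", (item_id,)).fetchone()
    return row[0]


def deduct_stock(item_id: int, expected_version: int) -> bool:
    """Apply the update only if the row still carries the version that was read."""
    cur = conn.execute(
        "UPDATE item SET stock = stock - 1, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (item_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when the version no longer matches, i.e. a stale or duplicate request.
    return cur.rowcount == 1


# Both duplicate requests read version 1 before either of them updates.
v_first = read_version(1)
v_second = read_version(1)

first_ok = deduct_stock(1, v_first)    # matches version 1 and bumps it to 2
second_ok = deduct_stock(1, v_second)  # still carries version 1, so no row matches

print(first_ok, second_ok)
```

The caller decides what a `False` result means; for a duplicate submission it can usually be treated as "already applied" rather than as an error.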
When the second request then runs its update, the version it holds no longer matches, so the update fails.

Select + insert, update, or delete: query before operating, and perform the operation only if the query shows the requirements are met. This solves the idempotency problem in systems without concurrency. JVM locking can guarantee idempotency when there is concurrency within a single JVM, but it cannot do so in a distributed environment, where a distributed lock can be used instead.

Distributed lock: when, for example, no unique field can be determined, a distributed lock can be introduced through a third-party system (Redis or ZooKeeper). Before inserting or updating data, the business system acquires the distributed lock, performs the operation, and then releases the lock. This is really the idea of multi-threaded locking carried over to many systems, that is, to a distributed system.

Key point: if a long-running process must not execute concurrently, acquire a distributed lock keyed on an identifier (user ID + suffix, etc.) before the process starts. Other executions fail to acquire the lock, so only one execution can succeed at any time. Release the distributed lock when execution completes. (The distributed lock is provided by the third-party system.)

State machine idempotency: when designing business that involves documents, or tasks tied to the business, a state machine (state transition diagram) is almost always involved. Each business document carries a state that changes under different conditions, usually as a finite state machine. If the state machine is already in the next state and a change for the previous state arrives, in theory that change cannot be applied, which guarantees the idempotency of the finite state machine.

Note: document business such as orders has a long state flow, so a deep understanding of state machines is a great help in designing business systems.

Deduplication (anti-replay) table, taking payment as an example: use a unique key as the unique index of the deduplication table, for example the order number. On every request, insert a row into the deduplication table based on the order number. If the insert succeeds, carry on with the business logic, and delete the order number's row from the deduplication table once the business logic is finished. If a duplicate request arrives in the meantime, its insert fails because of the table's unique index, and the operation keeps failing until the first request has returned its result.
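The insert-then-delete flow can be sketched as follows. This is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the `dedup` table, the `processed` list standing in for the real payment logic, and the function names are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical deduplication table: the order number is the unique key.
conn.execute("CREATE TABLE dedup (order_no TEXT PRIMARY KEY)")
conn.commit()

processed = []  # stands in for the real business side effect (e.g. a deduction)


def try_acquire(order_no: str) -> bool:
    """Insert the order number; the unique key rejects a duplicate that is still in flight."""
    try:
        conn.execute("INSERT INTO dedup (order_no) VALUES (?)", (order_no,))
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False


def release(order_no: str) -> None:
    """Delete the row once the business logic has finished."""
    conn.execute("DELETE FROM dedup WHERE order_no = ?", (order_no,))
    conn.commit()


def pay(order_no: str) -> str:
    if not try_acquire(order_no):
        return "rejected: already in progress"
    try:
        processed.append(order_no)  # business logic runs only after a successful insert
        return "ok"
    finally:
        release(order_no)


held = try_acquire("ORDER-1001")    # first request takes the row
duplicate = pay("ORDER-1001")       # duplicate arrives meanwhile and is rejected
release("ORDER-1001")               # first request finishes
after_release = pay("ORDER-1001")   # the same order number is accepted again
print(held, duplicate, after_release)
```

Because the row is deleted afterwards, the table only serializes requests that are in flight at the same time. A later retry with the same order number passes the insert again, so the business logic still needs its own check, for example on the order state.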
As this shows, the anti-replay (deduplication) table effectively acts as a lock. Note: it is best combined with a state-machine check to decide idempotency.

Buffer queue: requests are accepted quickly and placed into a buffer queue, and asynchronous tasks then process the data in the queue and filter out duplicate requests. The advantage of this solution is that synchronous processing becomes asynchronous, which gives high throughput. The disadvantage is that the result of a request cannot be returned promptly, so the caller has to poll for the processing result afterwards.

Globally unique number: for example, the caller passes source + unique sequence number to the back end, and the back end uses it to determine whether the request is a duplicate. Under concurrency, only one of the requests is processed.

-END-

Author: three points evil \ original text: