Preface

Background

  • Bachelor's degree from an ordinary university (not 985/211), majoring in Computer Science and Technology;
  • Class of 2019, 2 years of experience;
  • Backend R&D at a traditional mid-sized company in Guangzhou;
  • This round of job hunting was through experienced-hire (non-campus) recruitment;

Knowledge reserves

  • We often say that "the interview asks you to build a rocket, the job has you tightening screws". Indeed, most of what interviews probe may never be used at work; it is commonly called the "eight-legged essay" (rote interview material);
  • But in-depth knowledge often gives us the reserve we need when we hit harder problems, sharpens our thinking and rounds out our knowledge system;
  • Sources of knowledge include highly rated technical books, summaries from bloggers, official accounts and blogs, as well as algorithm practice (LeetCode, 100+ problems suggested);
  • Mind maps can be used to organize a knowledge system and are generally an efficient way to find and fill gaps (it varies from person to person, so it is worth a try ~);

  • Throughout the interview process, communication, knowledge system and business thinking are each tested and improved. Review promptly after every interview, keep a steady mindset, and fill the gaps you find;

Interview experiences

Fordeal

Business: cross-border e-commerce, supply chain; Result: salary negotiation fell through, rejected at the HR round;

First round (40min)

  • Self-introduction and project introduction: overall service scale, business logic flow, which upstream businesses there are, project difficulties and how they were solved;
  • Dubbo:
    • Common Dubbo filters (ValidationFilter, ExceptionFilter);
    • How does a custom filter get injected? (Dubbo SPI injection, chained invocation);
    • Can default settings such as the communication protocol or serialization be modified in a Filter? (Dubbo initializes the communication protocol and serialization mode through the Spring configuration, so they should not be modified in a Filter);
  • ES:
    • Business scenarios where it is used (multi-condition combined search + full-text search requirements);
    • Index storage design with a wide table; why not implement it with the DB? (design idea: redundant fields in a wide table + avoiding joins across indexes);
    • Where the delays are in heterogeneous DB->ES synchronization (DB->Canal depends on MySQL primary/replica replication delay; MQ delivery and consumption; ES writes go to an in-memory buffer first, the translog can be lost before it is flushed to disk, and documents become searchable only after a refresh to the OS cache, about 1s);
    • How to avoid the delay in strongly real-time scenarios? (read from the DB; near-real-time ES cannot guarantee it) What is the synchronization throughput?
  • Spring:
    • Spring Schedule principle (backed by a ScheduledExecutorService);
  • MQ:
    • Heterogeneous data synchronization and duplicate message delivery (idempotent processing);
  • Java:
    • Frequent YGC scenario (a traffic spike, slow DB getConnection, unstable DB connection count, no slow queries, high YGC frequency, long STW time matching the log timestamps; temporary measures: scale out instances and enlarge the young generation; my overall description was unclear);
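
The "idempotent processing" answer in the MQ item above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the class name `IdempotentConsumer`, the idea that each message carries a unique ID, and the in-memory set (a durable store such as a unique key in the DB would be needed in practice) are all assumptions.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical wrapper that applies each message ID at most once. */
final class IdempotentConsumer {
    // Assumption: an in-memory set stands in for a durable deduplication store.
    private final Set<String> processedIds = ConcurrentHashMap.newKeySet();
    private final Consumer<String> handler;

    IdempotentConsumer(Consumer<String> handler) {
        this.handler = handler;
    }

    /** Returns true if the payload was applied, false if the ID was a duplicate. */
    boolean onMessage(String messageId, String payload) {
        // add() is atomic: only the first delivery of a given ID gets true.
        if (!processedIds.add(messageId)) {
            return false;
        }
        try {
            handler.accept(payload);
            return true;
        } catch (RuntimeException e) {
            // Forget the ID so that a redelivery can retry the failed message.
            processedIds.remove(messageId);
            throw e;
        }
    }
}
```

With this shape, a redelivered message with the same ID is acknowledged but not applied a second time.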

Second round (50min)

  • Self-introduction; current salary and which ability dimensions support it; the design/module in the project I am most satisfied with;

  • Java:

    • JVM: quota, young generation, old generation, heap memory;
    • YGC process (objects are created too fast and GC reclaims too slowly; young generation -> Eden, S1, S2 -> old generation; allocation guarantee; cross GC phenomenon; I did not explain this part well);
    • GC troubleshooting process (APM, alerts, jstat -gcutil pid, heap dump, MAT, compare with the time of the service exception, identify large objects in the heap, find the cause, resolve);
    • A heap dump taken during GC can stall the application and trigger an avalanche, so throttle the dump policy with a frequency threshold; print GC logs (-XX:+PrintGCDetails);
    • CountDownLatch vs CyclicBarrier (one thread waits for a group of threads vs a group of threads wait for each other; asked: with a thread pool of 100, a CountDownLatch of 1000 and a CyclicBarrier of 1000, which runs normally and which blocks);
    • wait, notify & Semaphore;
  • MySQL:

    • Back-to-table lookup (secondary index lookup -> primary key ID -> primary index + random IO to fetch the full row);
    • How to reduce back-to-table lookups (covering index), benefit (fewer random IO lookups), cost (index maintenance on update, storage);
    • Does a lock apply to the row or to the index? (I was a little confused and said it locks the row record);
    • EXPLAIN: the difference between Using index, Using index condition and Using where, and which cases need to go back to the table;
  • Microservices:

    • Common microservice architecture layers (service nodes, registry (service registration, discovery, updates), gateway, monitoring, components, functions);
    • Core functions of a gateway (I mentioned rate limiting and a unified traffic entry point; the interviewer hinted at the RPC level: unified protocol encapsulation and conversion, the RPC process);
    • HTTP/1.1 vs HTTP/2 (long vs short connections, process, differences; header compression, binary framing, multiplexing HTTP requests over one TCP connection);
  • ES:

    • Inverted index structure (key: term, value: document IDs);
    • Query optimization (hardware, index structure design, DSL optimization);
    • Query process (queryAndFetch);
    • How to handle aggregated results that are too large (paging limits, limits on a single node; the interviewer hinted at the priority queue on each primary shard);
    • What shards are for (data redundancy and query load sharing);
    • Dynamic shard expansion (replica shards can be expanded directly; if primary shards were expanded, would the new primaries be unwritable in order to keep data consistent? The number of primary shards cannot be changed, while the number of replica shards can be adjusted; capacity can be expanded by adding cluster nodes, and primary shard data is rebalanced onto the new nodes);
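
The CountDownLatch / CyclicBarrier question from the Java item above (a pool smaller than the count) can be reproduced at a small scale. The sketch below is illustrative only; the class name `LatchVsBarrierDemo`, the pool sizes and the timeouts are assumptions chosen to keep the run short. A latch task counts down and returns, so a small pool can work through any number of tasks; a barrier task occupies its pool thread until all parties arrive, so with fewer threads than parties nobody gets through.

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

final class LatchVsBarrierDemo {
    /** Runs `tasks` countDown() calls on `poolSize` threads; true if the latch opens. */
    static boolean latchCompletes(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            CountDownLatch latch = new CountDownLatch(tasks);
            for (int i = 0; i < tasks; i++) {
                pool.execute(latch::countDown); // counts down and frees its thread
            }
            return latch.await(2, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            pool.shutdownNow();
        }
    }

    /** Returns how many tasks got past a barrier of `parties` on `poolSize` threads. */
    static int barrierPassed(int poolSize, int parties) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CyclicBarrier barrier = new CyclicBarrier(parties);
        AtomicInteger passed = new AtomicInteger();
        try {
            for (int i = 0; i < parties; i++) {
                pool.execute(() -> {
                    try {
                        // A waiting task pins its pool thread until all parties arrive.
                        barrier.await(500, TimeUnit.MILLISECONDS);
                        passed.incrementAndGet();
                    } catch (InterruptedException | BrokenBarrierException | TimeoutException e) {
                        // Timed out or barrier broken: the other parties never got a thread.
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return passed.get();
    }
}
```

Without the timeout on `await`, the undersized-pool case would simply hang, which is the blocking behaviour the question is after.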

HR round (40min)

  • Current form of organizational collaboration, cross-departmental communication, process;
  • How to push things forward when departments do not cooperate;
  • Future development plan and business field direction;
  • Expected salary;

YY

Business: YY Live; Result: after the interview the department said there was no HC and offered to refer me to another department, but there was no follow-up; failed at the first round;

First round (40min)

  • Project:

    • Service troubleshooting process (APM trace, logs -> locate the specific problem: slow SQL analysis / network latency / STW -> GC troubleshooting);
    • Commands: jstack, jstat -gcutil pid;
    • Full-link stress test process (determine business scenarios and use cases, estimate in advance, deploy the environment, stub out dependencies, run multiple rounds, find problems, fix them one by one);
    • How is trace propagation implemented? (the traceId is injected by an agent and carried in the protocol when the request is forwarded, so it is non-intrusive for service development);
    • The interviewer did not seem very satisfied and asked how instrumentation is done for requests over different protocols;
    • (Context is passed within a service through a ThreadLocal on the request thread)
      • What about asynchronous threads? I said by passing a variable; he said that would require changes in business code and is not suitable (I later read the internal implementation: agent injection + HTTP header / Dubbo RpcContext + log component parsing and recording);
  • Sentinel:

    • Rate limiting implementation and which algorithm is used (leaky bucket, token bucket);
    • If you designed it yourself, how would you keep it simple and lightweight (I mentioned single-node rate limiting, injection via a starter, auto-configured rate limiting and degradation rules loaded into memory, proxying interfaces to collect exception and latency statistics; for Dubbo services it can also be done by filter interception, and so on);
  • Java:

    • HashMap structure (the usual answer);
  • MySQL:

    • Transaction isolation levels and the default level (RR under the InnoDB storage engine, which can prevent phantom reads);
    • MVCC implementation and the version chain lookup (the different transaction IDs: minimum ID, active transaction IDs, next ID, compared against the transaction IDs on the version chain, and so on);
    • How RR is implemented (a ReadView generated up front and reused within the transaction);
  • Dubbo:

    • Dubbo + Hessian vs RMI + JDK serialization;
    • How does Protobuf differ from Hessian in implementation?
    • How to implement load balancing at the network protocol level (I did not understand the question: a protocol covers network communication and exchange, and load balancing is not related to it…)
  • Spring:

    • Transaction propagation and how it is used in production (uses of Spring transaction propagation);
    • Whether a synchronous thread can read data written inside the transaction (yes, within the current transaction), and whether an asynchronous thread can read it (no, not before the transaction commits).
  • Redis:

    • Data structures (string, list, hash, set, zset);
    • HA, Sentinel: describe failure detection and the failover process (the usual answer);
    • How newly added Redis nodes are discovered (I could not remember at the time: in Sentinel mode new nodes are discovered through communication within the cluster, and in Cluster mode slots can be assigned to the new nodes);
    • Have you used Redis Pipeline? Talk about it (I had not);
  • Logic:

    • A phone call log with a very large number of entries: how to count the records per phone number and find the top N (HashMap with the phone number as key and the occurrence count as value, incremented on each record; keep the top N separately in a queue on insertion; an ordered map should also work, but I did not think of it at the time…)
      • What if memory runs out? (Group the records, count each group, then compare pairwise in a divide-and-conquer fashion; the interviewer did not ask further.)
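
The counting answer to the phone-log question above can be sketched as a HashMap for the counts plus a bounded min-heap for the top N. The class and method names are hypothetical, and the sketch assumes the whole log fits in memory; for the out-of-memory follow-up, the same routine could be run per partition after splitting the log by a hash of the phone number.

```java
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

final class TopNumbers {
    /** Returns the n most frequent phone numbers, most frequent first. */
    static List<String> topN(List<String> calls, int n) {
        Map<String, Integer> counts = new HashMap<>();
        for (String number : calls) {
            counts.merge(number, 1, Integer::sum); // value is the occurrence count
        }
        // Min-heap ordered by count: the root is the weakest of the current top n.
        PriorityQueue<Map.Entry<String, Integer>> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a.getValue(), b.getValue()));
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            heap.offer(entry);
            if (heap.size() > n) {
                heap.poll(); // drop the least frequent candidate
            }
        }
        LinkedList<String> result = new LinkedList<>();
        while (!heap.isEmpty()) {
            result.addFirst(heap.poll().getKey()); // smallest polled first, so prepend
        }
        return result;
    }
}
```

The heap keeps memory for the ranking step at O(n) instead of sorting every distinct number; the order among numbers with equal counts is unspecified.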

Kingsoft WPS

Business: cloud platform, leaning toward middleware; Result: received an offer;

First round (83min)

  • Java:
    • Collections: ArrayList vs LinkedList (differences, usage scenarios); HashMap structure; head insertion vs tail insertion (how a colliding node joins the bucket's linked list) and what problem the change solves (infinite loop under concurrency); red-black tree conversion condition (array length >= 64 && list length > 8); query time complexity (O(log N));
    • Concurrency: ConcurrentHashMap structure and segment locks; Lock vs the synchronized keyword, implementation and differences (Lock is implemented in the Java class library, synchronized at the JVM level; Lock must be unlocked manually; synchronized can be applied to code blocks and methods; implementation: ACC_SYNCHRONIZED for methods, monitorenter/monitorexit on the lock object's monitor for code blocks);
    • What problem does frequent CAS cause (CPU spike -> lock upgrade)? Describe the lock upgrade process (biased lock, lightweight lock, spinning, heavyweight lock);
    • Java thread pool (worker, boss thread), working principle, how the thread count changes at each stage (core threads, queue, maximum threads, rejection policy);
    • The JVM memory model, the role of each area, the heap and what data is stored in it (all covered in the book Understanding the Java Virtual Machine);
    • Garbage collection process and algorithms, and the problems they cause (mark-sweep: memory fragmentation, earlier GCs); the algorithm used by each generation; GC Roots tracing and which objects act as roots; how to decide an object can be collected (no references, unreachable from GC Roots);
    • CMS: advantages (concurrent, short STW time), collection process, STW phases (initial mark, remark), disadvantages (sensitive to CPU resources, floating garbage, fragmentation from mark-sweep);
  • Redis:
    • Lock scenario, process and commands: get, setnx + expire time, del; how atomicity is achieved (merging the commands in a Lua script; the interviewer said that was the approach in earlier versions but not later ones, since an official atomic command is now provided);
    • Meaning of SETNX (I forgot what NX stands for; the interviewer hinted "if Not eXists" and asked what would go wrong without atomicity);
    • T1 takes a long time to execute, its lock expires and is acquired by another thread. How to solve this? I could only guess: an asynchronous timer thread renews the lock before it expires and stops renewing according to some policy. The key is the lock name and the value is a UUID or version number; when deleting, check that the lock is held by the current thread so that it is not deleted by another thread;
    • Is concurrency safety inside Redis a concern? (Redis uses a single-threaded model with a single event queue, so there is no race condition);
  • Network protocol:
    • HTTP and TCP: layers and relationship (application layer, transport layer; network packets are wrapped layer by layer for delivery);
    • TCP vs UDP; the TCP three-way handshake and four-way teardown; what would go wrong with a three-step teardown, and what states the client and server are in during the four steps;
    • After the server sends its FIN, the client ACKs and then waits 2 MSL before closing. Why is this design necessary (so that if the ACK is lost in the network, the server's retransmitted FIN can still be acknowledged)?
    • TCP sticky packets: why they occur (a continuous byte stream with no message boundaries) and at which layer unpacking happens (the application-layer protocol);
    • Sliding window process and congestion control (slow start; asked how the current rate increase is determined, I mentioned confirming the round-trip time; asked how that is confirmed, I said a sender-side timer and statistics; the interviewer did not seem satisfied, and I could not recall the details of the congestion control algorithm at the time);
  • MySQL:
    • Index structure (data pages, leaf nodes, non-leaf nodes, orderliness, in-page data);
    • B+ tree (why not a B-tree: B-tree non-leaf nodes store data, which leads to the tree height problem);
    • Common SQL optimizations (indexes, covering indexes, index ordering, leftmost-prefix matching, EXPLAIN), what the back-to-table process is (secondary index + primary index), how to optimize it (select only the needed columns + covering index);
    • Business sharding (splitting databases and tables): scenarios, problem points, cross-table queries, whether read/write splitting or a primary/replica setup is used (I answered that the replica is only for backup, to avoid replication delay);
  • ES:
    • Talk about ES: inverted index, shard structure, why it is fast (key is a term, value is document IDs; distributed storage, query load sharing, flexible expansion);
    • Search process (client request, coordinating node, forwarding, local query on each shard, return);
    • Query optimization (index design per business, field type design, primary and replica shard configuration, redundant backup + query load sharing, DSL optimization, scoring vs non-scoring queries, limits on query size);
    • Performance bottlenecks encountered (data per index is small and indexes are split by business, so none yet);
  • MQ:
    • Usage scenarios: service decoupling, peak shaving;
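
The thread-count stages from the Java thread pool item above (core threads, then the queue, then the maximum thread count, then the rejection policy) can be observed directly on a JDK `ThreadPoolExecutor`. The sizes below (core 1, max 2, queue capacity 1) and the class name `PoolGrowthDemo` are assumptions chosen to make each stage visible with four tasks.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

final class PoolGrowthDemo {
    /** Submits 4 blocking tasks and records the pool state after each submission. */
    static String run() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 2, 1, TimeUnit.SECONDS, new ArrayBlockingQueue<>(1));
        CountDownLatch release = new CountDownLatch(1);
        Runnable blocker = () -> {
            try {
                release.await(); // keep the worker busy until the demo ends
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        StringBuilder log = new StringBuilder();
        try {
            for (int i = 1; i <= 4; i++) {
                try {
                    pool.execute(blocker);
                    log.append("threads=").append(pool.getPoolSize())
                       .append(",queued=").append(pool.getQueue().size()).append(';');
                } catch (RejectedExecutionException e) {
                    log.append("rejected;"); // default AbortPolicy
                }
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
        return log.toString();
    }
}
```

Task 1 starts the core thread, task 2 waits in the queue, task 3 finds the queue full and starts a second thread, and task 4 is rejected because both the queue and the maximum thread count are exhausted.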

Second round (40min)

  • MySQL:
    • Indexes, ordering, cases where an index is not used, common optimizations (index structure, ordered lookup, creating and using indexes sensibly, EXPLAIN analysis);
    • A student table with three columns (student id, height, weight): query height > 1.7m and sort by weight; how would you do it;
  • Redis:
    • Lock and unlock process (check whether the lock is held by the current thread, try to lock with SETNX + EX, DEL the lock when done);
    • How to handle the lock being acquired by another thread after the holder times out (store the thread ID with the lock key and check it when locking and unlocking, so that an expired holder does not interfere with the new holder and cause abnormal business operations; or have a daemon thread renew the lock in advance);
  • Full-link stress testing:
    • Implementation process, how to improve service metrics;
    • What the asynchronous scenarios look like (asynchronous message push via async threads / MQ);
    • Is the data volume in the DB considered in the stress test? (The data volume must be on the same scale as the production cluster; flow-record queries use date-level tables);
  • Sentinel:
    • Implementation principle (Entry, slot chain construction, chained invocation for rate limiting and degradation);
    • What the rate limiting and degradation strategies are (RT, exception count, latency) and which rate limiting / statistics algorithms are used;
    • Sliding window: how the time window is implemented;
  • Project:
    • Design process;
    • What dimensions to consider (research, requirements analysis, identifying the business elements, decoupling modules as much as possible to improve flexibility);
    • Design patterns used, scenarios for the strategy pattern, what the flow is;
    • How to optimize many nested ifs in a project (I answered the chain of responsibility pattern: pass to the next node after each step);
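
For the sliding-window question in the Sentinel item above, one common way to build the time window is a ring of fixed-width buckets that are reset when they fall out of the window. The sketch below is a simplified illustration of that idea under assumptions of my own (class name, explicit `nowMillis` parameter, coarse `synchronized` locking); it is not Sentinel's actual implementation.

```java
import java.util.Arrays;

final class SlidingWindowLimiter {
    private final int limit;           // max requests allowed inside one window
    private final long bucketMillis;   // width of a single bucket
    private final long[] bucketStart;  // start time of the data held in each slot
    private final int[] bucketCount;   // requests counted in each slot

    SlidingWindowLimiter(int limit, long windowMillis, int buckets) {
        this.limit = limit;
        this.bucketMillis = windowMillis / buckets;
        this.bucketStart = new long[buckets];
        this.bucketCount = new int[buckets];
        Arrays.fill(bucketStart, Long.MIN_VALUE); // no slot holds valid data yet
    }

    /** Tries to admit one request at time nowMillis (non-negative). */
    synchronized boolean tryAcquire(long nowMillis) {
        long currentStart = nowMillis - nowMillis % bucketMillis;
        int idx = (int) ((nowMillis / bucketMillis) % bucketStart.length);
        if (bucketStart[idx] != currentStart) {
            // The slot still holds a bucket from an earlier lap of the ring: reuse it.
            bucketStart[idx] = currentStart;
            bucketCount[idx] = 0;
        }
        long oldestValid = currentStart - bucketMillis * (bucketStart.length - 1);
        int total = 0;
        for (int i = 0; i < bucketStart.length; i++) {
            if (bucketStart[i] >= oldestValid) {
                total += bucketCount[i]; // only buckets inside the window count
            }
        }
        if (total >= limit) {
            return false;
        }
        bucketCount[idx]++;
        return true;
    }
}
```

Passing the time in explicitly keeps the sketch deterministic; a production limiter would read a clock and use finer-grained concurrency control.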

Third round (43min)

  • Algorithm:

    • Single Number: find the number that occurs only once

      • Map(num, count);
    • LRU

      • Access list / LRU linked list; on each access, move the node to the head of the LRU list;

      • The LRU list does not support random access and can only be traversed, so it is indexed by an extra hash table that points to the LRU list nodes;

    • Arithmetic expression: given a valid expression, compute the result

      • Use a stack: split the expression into tokens, push numbers directly, push an opening parenthesis and evaluate back to it when the matching closing parenthesis arrives, push the intermediate results, and continue until the end;
    • Maximum path in a graph: the paths carry weights and the nodes carry values; find the path with the maximum value; a node cannot branch to several nodes at the same time, so it is in fact a tree;

      • Use DFS to get the maximum value of path A, then traverse path B and update MaxValue; later a second dimension is used to store the accumulated value

      • The interviewer said this was not very good; afterwards I explained using the second dimension to record whether each node is selected.

  • Project:

    • Hypothetical scenarios built on the project: how to implement them, probing project scalability and data consistency (very in-depth, not expanded here);
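
The LRU answer in the algorithm item above (a linked list ordered by recency plus a hash table that points at the list nodes) can be sketched as follows. The class name and the choice of sentinel head/tail nodes are assumptions; the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;

final class LruCache<K, V> {
    private static final class Node<K, V> {
        K key;
        V value;
        Node<K, V> prev, next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final int capacity;
    private final Map<K, Node<K, V>> index = new HashMap<>();  // key -> list node
    private final Node<K, V> head = new Node<>(null, null);     // sentinel, most recent side
    private final Node<K, V> tail = new Node<>(null, null);     // sentinel, least recent side

    LruCache(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    V get(K key) {
        Node<K, V> node = index.get(key);
        if (node == null) return null;
        unlink(node);
        linkFirst(node); // an access moves the node to the head
        return node.value;
    }

    void put(K key, V value) {
        Node<K, V> node = index.get(key);
        if (node != null) {
            node.value = value;
            unlink(node);
            linkFirst(node);
            return;
        }
        if (index.size() == capacity) {
            Node<K, V> eldest = tail.prev; // least recently used entry
            unlink(eldest);
            index.remove(eldest.key);
        }
        node = new Node<>(key, value);
        index.put(key, node);
        linkFirst(node);
    }

    private void unlink(Node<K, V> n) {
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    private void linkFirst(Node<K, V> n) {
        n.next = head.next;
        n.prev = head;
        head.next.prev = n;
        head.next = n;
    }
}
```

Both `get` and `put` run in O(1): the hash table finds the node and the doubly linked list relinks it without traversal.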

HR round (20min)

  • Self-introduction, project information;
  • Communication and collaboration;
  • Salary, location confirmation;

Tuya Smart

Business: IoT & IoT test platform infrastructure; Result: too busy to continue, declined;

First round (30min)

  • NIO:
    • NIO and zero copy: implementation principle, advantages, possible problems (file IO / network IO, blocking/non-blocking, synchronous/asynchronous, the zero-copy path for network transfers);
  • Dubbo:
    • Service registration and discovery (described the overall architecture: the registry for registration and discovery, the ZK watch mechanism, service providers and consumers);
    • Call process (service discovery, invocation, serialization, method wrapping, chained calls, invoke, processing, layer-by-layer return, deserialization);
    • How calls work with multiple providers (covered the four load balancing strategies, scenarios and fault tolerance; asked about consistent hashing and its implementation);
  • Zookeeper:
    • Usage scenarios (registry and locks);
    • Node data structure (file tree), node types (ephemeral, persistent, combined with sequential nodes);
    • Election process (not familiar with ZK; I answered by analogy with Redis Sentinel's election process);
    • How to deal with split brain (trxId comparison);
  • MySQL:
    • Scenario: how a query executes when the SELECT columns and WHERE columns are in the index versus not in the index (columns outside the index require extra back-to-table lookups);
    • Index data structure, back-to-table lookups and how to avoid them, random vs sequential IO;
  • Java:
    • JVM, common garbage collection algorithms, features, usage scenarios, procedures;
    • CMS garbage collection process, what each phase does (initial mark, concurrent mark, remark, sweep), which phases are STW, and whether GC and user threads run together in each phase;
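
For the consistent hashing follow-up in the Dubbo item above, a minimal ring can be built from a sorted map with several virtual nodes per provider. This is a sketch under assumptions: the class name, the MD5-based hash and the `provider#i` virtual-node naming are illustrative choices, not a copy of Dubbo's load balancer.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Collection;
import java.util.Map;
import java.util.TreeMap;

final class ConsistentHashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>(); // position -> provider
    private final int virtualNodes;

    ConsistentHashRing(Collection<String> providers, int virtualNodes) {
        this.virtualNodes = virtualNodes;
        for (String provider : providers) {
            add(provider);
        }
    }

    void add(String provider) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(provider + "#" + i), provider);
        }
    }

    void remove(String provider) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.remove(hash(provider + "#" + i));
        }
    }

    /** Picks the first provider clockwise from the key's position on the ring. */
    String select(String key) {
        if (ring.isEmpty()) throw new IllegalStateException("no providers");
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash(key));
        return (entry != null ? entry : ring.firstEntry()).getValue(); // wrap around
    }

    private static long hash(String s) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(s.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (digest[i] & 0xFF); // first 8 digest bytes as a long
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 ships with every Java platform
        }
    }
}
```

The property interviewers usually look for is that removing one provider only remaps the keys that pointed at it; keys owned by other providers keep their target.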

(10 minutes later the HR called to schedule the next round, but I happened to have a large requirement due at work and interviews with other big companies, so I could not fit it in and declined)

KuGou Music

Business: KuGou K song; Result: received an offer;

First round (30min)

  • Full-link stress testing:

    • Process (described scenario and use-case selection, metric confirmation, building the stress-test environment, sampling, exception/latency ratios, locating points in time via APM traces);
    • How to ensure a three-nines SLA and which dimensions are measured (service unavailable time, exceptions and failures, latency ratio, automated collection of statistics);
  • Circuit breaking and rate limiting:

    • Metrics (exception count, latency, ratio statistics);
    • Circuit-breaking measures (return null, throw an exception, reject during the break period, then recover automatically);
    • Service avalanche (in a microservice call chain, an exception on one node drags down healthy upstream and downstream nodes; fault tolerance such as retries can add request pressure and eventually crash multiple healthy services);
  • MySQL:

    • Query optimization, composite indexes, fewer back-to-table lookups, etc. (the usual answer);
    • Given queries on A, on B, and on A + B, how to build the indexes (a composite index on (A, B) plus a single index on B: the first and third use the composite index, the second goes straight to the single index on B);
    • Sharding: given an order table with 500 million rows, query the total count (if sharded by date, count each shard table and sum the results, which is fine when the business does not query it often; if sharded by order-id hash, count the tables step by step);
  • Cache:

    • Data structures, cache & DB consistency (described the cache-aside pattern, the cache read and update flows, eventual consistency; full strong consistency needs locking);
      • The interviewer seemed unsatisfied with how simple the scenario was and expected caching of complex objects and strong-consistency use cases;
  • Troubleshooting:

    • Troubleshooting a CPU spike (top, top -Hp pid, jstack -l pid and look up the thread by its TID);
    • Thread stack analysis scenarios (deadlocks, infinite loops, etc.);
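
The circuit-breaking measures listed above (reject during the break period, then recover automatically) amount to a small state machine. The sketch below is an assumption-laden illustration: the class name, the consecutive-failure threshold and the explicit `nowMillis` argument are mine, and it omits the half-open probing that full implementations usually add.

```java
final class SimpleCircuitBreaker {
    private final int failureThreshold; // consecutive failures that open the breaker
    private final long openMillis;      // length of the break period
    private int consecutiveFailures;
    private long openedAt = -1;         // -1 means the breaker is closed

    SimpleCircuitBreaker(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** Returns true if a call may proceed at time nowMillis. */
    synchronized boolean allow(long nowMillis) {
        if (openedAt < 0) {
            return true;
        }
        if (nowMillis - openedAt >= openMillis) {
            // Break period is over: recover automatically.
            openedAt = -1;
            consecutiveFailures = 0;
            return true;
        }
        return false; // still open: reject (caller may return null or throw)
    }

    /** Records the outcome of a call that was allowed through. */
    synchronized void record(boolean success, long nowMillis) {
        if (success) {
            consecutiveFailures = 0;
            return;
        }
        if (++consecutiveFailures >= failureThreshold) {
            openedAt = nowMillis;
        }
    }
}
```

Rejecting quickly while the breaker is open is what keeps a failing dependency from tying up callers' threads, which is the avalanche scenario described in the same list.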

Second round (20min)

  • Introduce projects/modules that you think are well designed;
    • Described the xx service and the considerations behind building it (data volume, pain points of upstream businesses, platform-level analysis, research, development, rollout, business onboarding);
  • ES:
    • Selection (compared with MySQL along business requirements, scenarios, ES tokenization features, search performance, community activity, ease of operations and other dimensions);
    • Optimization (operations level: multiple shards and replicas, query load sharing and storage redundancy; index level: index template design; query level: DSL optimization);
    • How to handle latency spikes; ES data volume, cluster configuration, etc.;
  • Canal:
    • Synchronization process and how to keep the sync link reliable (Canal monitoring + scheduled monitoring in the service, automatic scale-out on message backlog, dynamic tuning of the pull batch size, message ACK/ROLLBACK guarantees);
    • Why use MQ for peak shaving (when large volumes of data change at once, MQ holds the backlog reliably and smooths the load on the sync service), how to handle external service exceptions (custom exceptions at the service level, handled by the business);
  • Other:
    • Tenure and views on the platform, team projects, communication and collaboration, the interviewer's introduction to their business, views on overtime, etc.;

Third round (30min)

  • Project:
    • Talk about the overall process of the project: design, selection, thinking process;
    • How ES query and synchronization are implemented (queryAndFetch process, tokenized storage, inverted index);
    • How to lead during project development: handling the business side, internal and external parties, requirements, plans, reviews, communication, coordination, risk and schedule control;
  • Spring:
    • IoC implementation process (after my general answer the interviewer asked for more detail, including the lazy-loading check and the circular dependency process step by step; I could not give the fine-grained flow);
    • How Spring is used (ServiceBean, custom component encapsulation, auto-configuration starters, Spring MVC, JDBC layer abstraction, encapsulation and extension);
  • Redis:
    • Selection, usage scenarios (lock/cache), lock and unlock flow in the business (GET, check, SETEX, DELETE);
    • How to handle the lock being preempted by another thread during task execution (use the value to determine whether the lock is held by the current thread);
    • How to handle task execution exceeding the expiration time (create a daemon thread for the current thread + renew the lock in advance + stop the thread and release the lock when the task finishes);
  • MySQL:
    • Index tree structure (expanded on the B+ tree);
    • Slow query troubleshooting process (locate via APM, analyze with EXPLAIN);
    • How HA works (primary/standby);
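
The two Redis lock answers above (check the value to confirm ownership before deleting, and renew before expiry) can be illustrated without a Redis client. The class below is a purely in-memory stand-in I made up for this sketch: `tryLock` mimics the semantics of `SET key value NX PX ttl`, while `unlock` and `renew` mimic what an atomic compare-then-act script would do on the server. Time is passed in explicitly so the behaviour is deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory stand-in for the lock commands; not a Redis client. */
final class FakeLockStore {
    private static final class Entry {
        final String owner;
        final long expiresAt;
        Entry(String owner, long expiresAt) { this.owner = owner; this.expiresAt = expiresAt; }
    }

    private final Map<String, Entry> data = new HashMap<>();

    /** Succeeds only if the key is absent or already expired. */
    synchronized boolean tryLock(String key, String owner, long ttlMillis, long now) {
        Entry e = data.get(key);
        if (e != null && e.expiresAt > now) {
            return false;
        }
        data.put(key, new Entry(owner, now + ttlMillis));
        return true;
    }

    /** Deletes the lock only if it is still held by `owner`. */
    synchronized boolean unlock(String key, String owner, long now) {
        if (!heldBy(key, owner, now)) {
            return false; // expired or taken over: must not delete someone else's lock
        }
        data.remove(key);
        return true;
    }

    /** Extends the expiry only if the lock is still held by `owner` (watchdog renewal). */
    synchronized boolean renew(String key, String owner, long ttlMillis, long now) {
        if (!heldBy(key, owner, now)) {
            return false;
        }
        data.put(key, new Entry(owner, now + ttlMillis));
        return true;
    }

    private boolean heldBy(String key, String owner, long now) {
        Entry e = data.get(key);
        return e != null && e.expiresAt > now && e.owner.equals(owner);
    }
}
```

The owner value (for example a UUID per lock attempt) is what prevents a slow task whose lock has expired from deleting the lock of the thread that acquired it afterwards.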

ByteDance

Business: Douyin Live; Result: failed at the first round;

First round (70min)

  • Java:
    • Thread pool: core parameters and how the thread count changes; is there a way to grow to the maximum thread count before queueing (I did not think of one; afterwards I learned of Dubbo's thread pool implementation, which for IO-intensive RPC calls gives priority to allocating threads to process tasks);
    • The interviewer asked what benefits that has compared with the JDK thread pool (I talked about task types: IO-intensive, CPU-intensive, resource-intensive; he seemed unsatisfied), scenarios, blocking queues;
    • GC, CMS: garbage collection process, STW phases, why initial mark and remark need STW (incremental garbage marking);
    • The difference between ClassNotFoundException and NoClassDefFoundError and when each occurs (class loading lookup at compile time / runtime);
    • Troubleshooting high resource usage of a Java process (using common commands);
  • Mybatis:
    • How a Mapper interface executes a query (a proxy bound to the XML mapping); asked whether it is a static or dynamic proxy;
  • Spring:
    • How circular dependencies are resolved (three-level cache, process);
  • MySQL:
    • Index types, covering index, scenarios (explained the query process, secondary + clustered index, reducing random IO);
    • Isolation level implementation (MVCC);
    • select * from table for update;
    • select * from table where a = 'x' and b = 'x' and c != 'x' order by d limit 0, 10;
  • Network:
    • Common HTTP status codes 301 and 302: which part of the packet carries the redirect address and the cookie (the HTTP header);
    • HTTPS principle and encryption process (RSA handshake, application layer, transport layer, SSL encryption, client, server, CA certificate verification); which encryption algorithms do you know;
  • Redis:
    • Eviction policies (random, LRU x2, abort);
    • Expired-key deletion policies (timed, lazy, periodic);
  • Algorithm:
    • Binary tree serialization & deserialization (serialize the preorder and inorder node sequences, then rebuild the tree from them; I did not write it well);
    • Leetcode-cn.com/problems/se…
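
For the binary tree serialization question above, the answer given in the interview used preorder plus inorder sequences. The sketch below swaps in a different, commonly used technique: a single preorder traversal with `#` markers for null children, which also handles duplicate values. The class name, the `TreeNode` shape and the comma-separated format are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

final class TreeCodec {
    static final class TreeNode {
        final int val;
        TreeNode left, right;
        TreeNode(int val) { this.val = val; }
    }

    /** Preorder traversal; every null child is written as "#". */
    static String serialize(TreeNode root) {
        StringBuilder sb = new StringBuilder();
        write(root, sb);
        return sb.toString();
    }

    private static void write(TreeNode node, StringBuilder sb) {
        if (node == null) {
            sb.append("#,");
            return;
        }
        sb.append(node.val).append(',');
        write(node.left, sb);
        write(node.right, sb);
    }

    /** Rebuilds the tree by consuming tokens in the same preorder. */
    static TreeNode deserialize(String data) {
        Deque<String> tokens = new ArrayDeque<>(Arrays.asList(data.split(",")));
        return read(tokens);
    }

    private static TreeNode read(Deque<String> tokens) {
        String token = tokens.poll();
        if (token == null || token.equals("#")) {
            return null;
        }
        TreeNode node = new TreeNode(Integer.parseInt(token));
        node.left = read(tokens);  // the left subtree comes first in preorder
        node.right = read(tokens);
        return node;
    }
}
```

Because the null markers fix the shape of the tree, one traversal is enough and no second sequence is needed.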

Alibaba

Business: Cainiao Network; Result: received an offer;

First round (42min)

  • Project (asked in great detail):
    • Participated in the project, role in the team, responsible items, background;
    • Talk about the difficulties in the project;
    • How much service throughput, synchronization bottleneck, how to solve;
    • Search engine selection, DB data synchronization selection (business scenarios, features, community activity, ease of operations); asked whether the operations team or we ourselves deploy it (Canal server and ES are deployed by operations; the client + DTS are self-implemented);
    • How I deal with my strengths and weaknesses at work;
  • Java:
    • synchronized usage scenarios and principle (synchronizing contended variables, locking code blocks and methods, monitor);
    • volatile and its implementation principle (in terms of visibility and atomicity);
    • Java reflection implementation (reflection mode, class checking, loading process);
    • JVM runtime areas and their specific roles (standard interview material);
    • How FGC is triggered (not enough old-generation space for YGC promotion, allocation guarantee space not enough, System.gc(), etc.);
      • Does System.gc() always trigger FGC (not necessarily; it only notifies the JVM);
  • MySQL
    • Storage structure (familiar with InnoDB engine, index B+ tree)
    • Why use B+ trees? (tree height: directory nodes hold more child pointers; the doubly linked list of leaves supports range queries; loading by page reduces disk IO);
    • Transactions, isolation levels, implementation principles (MVCC, undo log, version chain lookup process);
  • Canal:
    • Working principle (emulates the MySQL replica interaction protocol, binlog primary/replica synchronization, data parsing);
    • Underlying architecture (I was not clear on it);
  • Design patterns
    • Familiar design patterns (template method), business usage scenarios;

Second round (20min)

  • Project:
    • Project design process, roles, stakeholders, difficulties and how they were solved, how the domain knowledge was acquired (the project was reviewed item by item);
    • Which design patterns and scenarios are used (described the template method pattern);
    • What technologies and source code I usually study (I said MQ and ES; no deeper follow-up…)
  • Dubbo:
    • Service A -> B, concurrent calls;
    • How the timeout circuit-breaking mechanism is implemented (through the Future and Condition classes);

Third round (30min)

  • Project:
    • Self-introduction;
    • Pick the project you found most challenging and that helped you grow most; how the storage structure and parsing of the business content are implemented;
    • Service allocation, resource allocation, instance number;
    • Current company size, business, R&D ratio;
  • ES:
    • Storage structure (inverted index), tokenizer, implementation language/engine (Java + Lucene);
    • Search principle (request initiation, coordinating node, routing, query load across nodes, shard redundancy), ES shards and their structure, whether a shard holds the complete data;
    • How data is persisted (memory buffer -> translog -> OS cache -> disk);
    • Retention of hot data in memory (eviction strategy);
    • If the data is distributed across different machine nodes, how to make sure the right node is queried (document ID routing); for keyword queries, how routing is done when there is no document ID at the beginning;
    • Checking tokenization (_analyze); how to adjust if the result is not as expected, e.g. the place name [Zhejiang] being split into [Zhe][Jiang];
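
The document ID routing mentioned in the ES item above can be reduced to a one-line formula: the shard is the hash of the routing value modulo the number of primary shards. The sketch below shows only that simplified form; `String.hashCode()` is a stand-in for the hash Elasticsearch actually uses, and the class and method names are hypothetical.

```java
final class ShardRouter {
    /** Simplified routing: shard = hash(routing) mod number of primary shards. */
    static int shardFor(String routing, int primaryShards) {
        // Assumption: String.hashCode() stands in for the real routing hash.
        // floorMod keeps the result non-negative for negative hash codes.
        return Math.floorMod(routing.hashCode(), primaryShards);
    }
}
```

The formula also suggests why the primary shard count is fixed once an index is created: changing the divisor would send existing document IDs to different shards. A keyword query has no routing value to hash, so it is sent to every shard and the results are merged.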

Written test (20min)

  • Engineering coding questions;

HR round (40min)

  • Self-introduction;
  • Project introduction, role assumed, responsible module is what, upstream and downstream;
  • Talk about your strengths, what you are good at;
  • Product advantages, highlights and the role of the business side;
  • What the competing products are, their characteristics and differences, how to catch up;
  • What are the difficulties I have encountered in the past two years, how have I solved them and what have I gained?
  • Whether there is practice in project/business innovation;
  • What kind of personality do you have?
  • How do you view regulatory policy, does it affect business development, and how could things be improved;
  • Current & expected salary, do you have any other offer options, and what do you know about the department you are going to work in?

Alibaba

Business: Lazada, Southeast Asia e-commerce, Infrastructure Department / transactions and payments; Conclusion: the process ended at the HR stage;

First interview (55min)

  • Project
    • Self-introduction; the project I consider most challenging, what problems came up, how they were solved, and the solution design;
  • Canal
    • What is MySQL's Compact row format? (I said I had forgotten; I mentioned enabling binlog in ROW mode);
    • DTS synchronization: consistency, how delays are handled, what TPS it reaches, and how to improve it;
  • ES
    • Cluster: how to expand node capacity (alerting + shard expansion);
  • MySQL:
    • How the ACID properties are implemented (meaning of ACID, transaction isolation without interference, MVCC, logs + flushing for persistence);
    • MySQL logs: undo log, redo log, binlog;
    • Undo log (transaction rollback, MVCC, roll pointer references);
    • Redo log (sequential writes, persistence, 16KB full-page flush I/O);
    • SQL execution flow: Buffer Pool -> undo log -> redo log (prepare) -> transaction commit -> binlog -> redo log (commit);
    • Storage engine index structure (B+ tree, multi-way search, data stored in leaf nodes), time complexity (O(log N), as in binary search);
    • The B-tree structure and why it is not used (data is also stored in non-leaf nodes, so the tree is taller; the B+ tree is shorter and exchanges pages between memory and disk more efficiently);
    • Lock types; what is a phantom read (T1 queries several times, T2 inserts data, the size of T1's result set changes) and how it is solved (record lock, gap lock, next-key lock);
  • Redis:
    • Data structures (the 5 common ones: string, list, set, zset, hash), plus Bloom filter and Geo, and their business usage scenarios;
  • Java:
    • How synchronized is implemented (object header, monitor object; code blocks: monitorenter/monitorexit; methods: the ACC_SYNCHRONIZED flag, implicit monitor);
    • Do you know the ForkJoinPool thread pool (not familiar with it, never used it in production; the interviewer did not ask further);
    • What are the new features of the current JDK version (8) compared with version 7 (Stream operations, lambda expressions, better support for functional programming)?
  • Spring:
    • Spring-boot ("key-value"); spring-loaded ("lookup");
    • An API writes data to the DB and then makes a third-party call; if that call fails, the data must be rolled back. How do you implement this (put both in the same Spring transaction, @Transactional);
    • Transaction propagation (…, MANDATORY);
  • Algorithm:
    • Find the second-largest number in an unsorted integer array (keep int num1 and int num2, compare in a single for loop while maintaining num1 and num2, return num2);
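
The single-pass approach from the algorithm question could look like the sketch below. One assumption the notes do not state: duplicates of the maximum are ignored, so the method returns the second-largest distinct value. The `long` sentinels are my own choice so that `Integer.MIN_VALUE` can appear in the input.

```java
/** One-pass search for the second-largest distinct value, keeping num1 (largest) and num2 (runner-up). */
class SecondLargest {
    static int secondLargest(int[] nums) {
        long num1 = Long.MIN_VALUE; // largest value seen so far
        long num2 = Long.MIN_VALUE; // second-largest distinct value seen so far
        for (int n : nums) {
            if (n > num1) {
                num2 = num1; // the old maximum becomes the runner-up
                num1 = n;
            } else if (n < num1 && n > num2) {
                num2 = n;
            }
        }
        if (num2 == Long.MIN_VALUE) {
            throw new IllegalArgumentException("need at least two distinct values");
        }
        return (int) num2;
    }

    public static void main(String[] args) {
        System.out.println(secondLargest(new int[] {3, 9, 4, 9, 1})); // prints 4
    }
}
```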

Second interview (55min)

  • Project:
    • Self-introduction + project introduction: difficulties, my role, and what exactly I did in it;
  • ES:
    • Technology selection: similarities and differences compared with Solr and Lucene; whether problems are usually handled by operations or by development;
    • ES architecture: storage structure (inverted index), distribution, redundancy + querying (primary/replica shard design, query routing, sharding);
    • Whether there have been performance problems (slow DSL queries, resources taken by other services in the same cluster), how they were handled (DSL optimization, migrating some services according to priority), and what the root cause of the performance problem was (CPU, memory usage, disk swap I/O, etc.);
    • How to scale out ES nodes (the number of primary shards apparently cannot be changed; presumably you add nodes, rehash shards, and route and migrate data), whether it causes downtime (no), whether I have handled a cluster migration, and how it was done;
    • How ES and MySQL work together, how the selection differs, and whether ES alone could replace MySQL (relational vs. non-relational database; it depends on the business scenario (tokenized search, data volume, query requirements) and storage requirements; usually DB + ES are combined);
  • RocketMQ:
    • Usage scenarios (service decoupling, peak shaving);
    • Component structure (producer + consumer + NameServer + Broker) and how the components communicate (heartbeats between NameServer and Broker; producers and consumers discover brokers through the NameServer; messages are delivered to the broker and consumed by the consumer);
    • With 10 brokers and 20 consumers, or 10 brokers and 5 consumers, how is consumption distributed (broker lock, consumption load balancing);
    • How ordered consumption is implemented (the producer routes to a queue, the storage side is FIFO, the consumer takes the broker and queue locks before consuming), and whether I have read the source;
  • Scenarios:
    • Payment -> order call (RPC): how to ensure data consistency; ignoring latency, how to detect and recover when a call request is lost (I described an asynchronous call + polling for the result, checking the order, or notifying the payment side once the order completes, caller retries, etc.); the interviewer said this was rather complex and in the end suggested simply making a blocking call to get the result…
    • How business consistency is implemented and where it was applied in practice (DTS built on reliable MQ consumption + retries); 2PC, 3PC, TCC scenarios;
    • Order ID generator design (IDs with business meaning, split into segments: one part can use the snowflake algorithm to generate a unique ID, with a business number such as the order date in the middle segment; for performance, a number generator can allocate a whole number segment at once to avoid frequent generation requests);
      • Do you know how the Snowflake algorithm is implemented?
  • Other:
    • How I usually study, what I have learned recently, my personality, project difficulties, how I communicate with the business side and other stakeholders, etc.
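
For the Snowflake question above, here is a minimal sketch of the commonly described layout: 41-bit timestamp, 10-bit worker ID, 12-bit sequence. The class name, the custom epoch (2020-01-01 UTC) and the choice to throw when the clock moves backwards are my own assumptions, not something from the interview.

```java
/** Minimal Snowflake-style generator: | 41-bit timestamp | 10-bit worker | 12-bit sequence |. */
class SnowflakeIdGenerator {
    private static final long EPOCH = 1577836800000L; // assumption: 2020-01-01T00:00:00Z
    private static final int WORKER_BITS = 10;
    private static final int SEQUENCE_BITS = 12;
    private static final long MAX_WORKER_ID = (1L << WORKER_BITS) - 1;
    private static final long SEQUENCE_MASK = (1L << SEQUENCE_BITS) - 1;

    private final long workerId;
    private long lastTimestamp = -1L;
    private long sequence = 0L;

    SnowflakeIdGenerator(long workerId) {
        if (workerId < 0 || workerId > MAX_WORKER_ID) {
            throw new IllegalArgumentException("workerId must be in [0, " + MAX_WORKER_ID + "]");
        }
        this.workerId = workerId;
    }

    synchronized long nextId() {
        long now = System.currentTimeMillis();
        if (now < lastTimestamp) {
            // Simplest possible policy; real systems may wait or use a backup strategy.
            throw new IllegalStateException("clock moved backwards");
        }
        if (now == lastTimestamp) {
            sequence = (sequence + 1) & SEQUENCE_MASK;
            if (sequence == 0) {
                // 4096 IDs already issued in this millisecond: spin until the next one.
                while (now <= lastTimestamp) {
                    now = System.currentTimeMillis();
                }
            }
        } else {
            sequence = 0;
        }
        lastTimestamp = now;
        return ((now - EPOCH) << (WORKER_BITS + SEQUENCE_BITS))
                | (workerId << SEQUENCE_BITS)
                | sequence;
    }
}
```

An order ID with business meaning, as described in the notes, could combine such a value with a date or business-type segment. Allocating number segments in batches is a separate optimization that is not shown here.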

Third interview (40min)

  • Project:
    • Project introduction, domain division and the basis for the design, difficulties and how they were solved, optimizations, architecture layers, principles of the components it depends on, in-depth follow-up questions;
    • DB -> ES heterogeneous data sync: consistency and latency problems, and how each is solved;
  • ES:
    • Optimization (hardware configuration, business index design, DSL optimization);
    • Shard storage (replicas for HA + query load balancing);
    • What problems massive writes/queries can cause and how to solve them (periodically flush the translog to disk, balance load across the cluster/shards);
    • How to deal with segment files being flushed to disk too frequently;
  • MQ:
    • How ordered consumption is implemented (producer side, storage side, consumer side), source code implementation;
    • Message storage structure & functions (CommitLog, ConsumeQueue, IndexFile);
    • How MQ transactional messages are implemented (implementation process, half-message flow);
  • Java:
    • Lock types (synchronized -> JVM level, Lock -> AQS implementation; expand on each);
    • Why ReentrantLock needs to be reentrant and how it works (the same thread can safely re-acquire the contended resource; AQS state++ / state-- checks);
    • CMS vs. G1: similarities, differences, and which to choose in production;
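
The "AQS state++" idea behind reentrancy can be illustrated with a small lock built on `AbstractQueuedSynchronizer`. This is a simplified, non-fair sketch under my own naming (`SimpleReentrantLock`), not the JDK's `ReentrantLock` source; it leaves out overflow checks, interruptible locking and Conditions.

```java
import java.util.concurrent.locks.AbstractQueuedSynchronizer;

/** Simplified reentrant lock: the AQS state holds the hold count of the owning thread. */
class SimpleReentrantLock {
    private static final class Sync extends AbstractQueuedSynchronizer {
        @Override
        protected boolean tryAcquire(int acquires) {
            Thread current = Thread.currentThread();
            int state = getState();
            if (state == 0) {
                // Lock is free: race with other threads via CAS.
                if (compareAndSetState(0, acquires)) {
                    setExclusiveOwnerThread(current);
                    return true;
                }
            } else if (current == getExclusiveOwnerThread()) {
                // Re-entry by the owner: no CAS needed, just bump the count (state++).
                setState(state + acquires);
                return true;
            }
            return false; // held by another thread: AQS will enqueue and park the caller
        }

        @Override
        protected boolean tryRelease(int releases) {
            if (Thread.currentThread() != getExclusiveOwnerThread()) {
                throw new IllegalMonitorStateException();
            }
            int next = getState() - releases; // state--
            boolean free = (next == 0);
            if (free) {
                setExclusiveOwnerThread(null);
            }
            setState(next);
            return free; // only a full release wakes up waiting threads
        }

        int holdCount() {
            return getState();
        }
    }

    private final Sync sync = new Sync();

    void lock() { sync.acquire(1); }

    void unlock() { sync.release(1); }

    int holdCount() { return sync.holdCount(); }
}
```

Without the owner check in `tryAcquire`, a thread that calls one locked method from another locked method on the same lock would block on itself forever. That is the usual argument for why reentrancy is needed.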

Fourth interview (35min)

  • Project:
    • Introduction, difficult points, how to solve;
    • English communication skills;
    • Fund security scenarios;
  • ES:
    • Search engine selection, optimizing deep paging, how the Scroll cursor works, date range queries;
    • DB -> ES heterogeneous data sync: handling consistency/latency, strategies for a message backlog, judging message priority;
    • Handling sync failures and fault tolerance (consumption blocked; monitoring + storing the service ID in Redis and retrying periodically);
  • MQ:
    • Reliable consumption & ACK, rollback mechanism (ConsumeQueue consumption offset confirmation);
    • How to ensure data consistency when the broker goes down (synchronous vs. asynchronous, log writing);
  • Cache:
    • Cache breakdown, penetration and avalanche scenarios, and how to solve them (the usual "eight-part essay" questions ~);
  • Canal:
    • Sync optimization, custom client implementation that delivers to MQ;

HR interview

  • We could not agree on timing, so I dropped out of the process;

Tencent

Business: PCG, e-commerce; Conclusion: rejected after the first interview;

First interview (85min)

  • Algorithm:
    • 3Sum: leetcode-cn.com/problems/3s…
  • Project:
    • Architecture, how the heterogeneous data sync to ES works, synchronization speed, latency;
  • Java:
    • The JMM (Java Memory Model) (I stumbled a bit at the time);
    • Thread pools: usage scenarios, model, core parameters, related data structures, how they work, how the waiting queue is implemented;
    • synchronized and Lock: implementation principles, differences, usage scenarios (the usual "eight-part essay" questions ~);
  • CS:
    • Operating system user mode and kernel mode, the difference (concept ~);
    • Zero-copy principle and common implementations (mmap, sendfile; I walked through the path: user-mode/kernel-mode switches, CPU copy, DMA operation, buffers, NIC);
  • Redis:
    • Data structure;
    • Redis Cluster implementation, slot assignment, how new nodes are added (slot assignment, node reallocation process);
    • Progressive rehash migration process (H0, H1);
    • Bloom filter implementation (bit array, multiple hash functions);
  • MySQL:
    • Database and table sharding (split criteria: time/ID), practice (proxy layer vs. application layer, and the corresponding middleware);
    • Common MySQL locks, scenarios (X/S locks, Record/table locks, IS/IX locks, GapLock);
    • Isolation level, MVCC implementation;
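
The Bloom filter item in the Redis questions above (bit array, multiple hash functions) can be sketched as follows. The class name and the way the k hash functions are derived (double hashing from `String.hashCode()` and FNV-1a) are my own illustrative choices. Production filters usually use stronger hashes such as Murmur.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Minimal Bloom filter: a bit array plus k hash functions derived by double hashing. */
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        if (size <= 0 || hashCount <= 0) {
            throw new IllegalArgumentException("size and hashCount must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    /** Sets the k bit positions that belong to the value. */
    void add(String value) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(value, i));
        }
    }

    /** false means "definitely absent"; true means "possibly present" (false positives can occur). */
    boolean mightContain(String value) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(value, i))) {
                return false;
            }
        }
        return true;
    }

    // i-th hash = h1 + i * h2, reduced to a valid bit index (int overflow simply wraps around).
    private int index(String value, int i) {
        int h1 = value.hashCode();
        int h2 = fnv1a(value);
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a over the UTF-8 bytes, used as the second base hash.
    private static int fnv1a(String value) {
        int hash = 0x811C9DC5;
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 0x01000193;
        }
        return hash;
    }
}
```

This is also why a Bloom filter is a common answer to cache penetration: a key the filter reports as absent never needs to reach the cache or the database.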

Closing thoughts

  • An interview tests my technical system, knowledge reserve, thinking and communication, adaptability and so on. There will be plenty of difficulties along the way, but with timely reviews and continuous improvement it can be done.
  • Find your own purpose, constantly give yourself the motivation to move forward;

In our limited youth, do meaningful things, things that help you grow and are valuable for the future – blank woman

It’s crowded at the bottom. I’ll meet you at the top.