Introduction
- Uber mode, introduced in Hadoop 2.x, can be understood simply as JVM reuse. When a MapReduce job runs in Uber mode, all of its Map tasks and Reduce tasks run in the container that hosts the ApplicationMaster (AM). In other words, only the AM container is started for the whole job. Because no separate Mapper or Reducer containers are launched, the AM does not need to communicate with remote containers, and the whole process is simpler.
- However, not every MapReduce job is suited to Uber mode. It is worth considering when the job's input is so small that starting the Map or Reduce containers takes longer than processing the data. In general, enabling Uber mode for such small jobs yields a 2x-3x performance improvement.
- In short, Uber mode can save a lot of time on jobs with small inputs.
Conditions for enabling Uber mode
- Related code:
  isUber = uberEnabled && smallNumMapTasks && smallNumReduceTasks && smallInput && smallMemory && smallCpu && notChainJob && isValidUberMaxReduces;
- Parameter interpretation:
- uberEnabled: the value of the mapreduce.job.ubertask.enable parameter, which defaults to false; that is, Uber mode is not enabled by default.
- smallNumMapTasks: the number of Map tasks in the job must be less than or equal to mapreduce.job.ubertask.maxmaps, which defaults to 9; that is, by default a job must have fewer than 10 Map tasks to run in Uber mode.
- smallNumReduceTasks: the number of Reduce tasks in the job must be less than or equal to mapreduce.job.ubertask.maxreduces, which defaults to 1; that is, by default a job must have fewer than 2 Reduce tasks to run in Uber mode.
- smallInput: the job's input data must be no larger than the value of mapreduce.job.ubertask.maxbytes, which defaults to the size of one HDFS block.
- smallMemory: because the tasks run in the container that hosts the AM, the configured Map memory (mapreduce.map.memory.mb) and Reduce memory (mapreduce.reduce.memory.mb) must not exceed the memory configured for the AM container (yarn.app.mapreduce.am.resource.mb).
- smallCpu: likewise, the vcores configured for Map tasks (mapreduce.map.cpu.vcores) and Reduce tasks (mapreduce.reduce.cpu.vcores) must not exceed the vcores configured for the AM container (yarn.app.mapreduce.am.resource.cpu-vcores).
- notChainJob: in addition, the Map class (mapreduce.job.map.class) and the Reduce class (mapreduce.job.reduce.class) must not be ChainMapper or ChainReducer.
- isValidUberMaxReduces: currently, Uber mode can be enabled only when the number of Reduce tasks is less than 2.
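To make the conditions above concrete, here is a minimal standalone sketch of the decision in plain Java. It is not Hadoop's actual class: the class name UberDecisionSketch, the helper get, and the method signature are hypothetical, the configuration is modelled as a simple Map, and the fallback values for the AM and task resources (1536 MB, 1024 MB, 1 vcore) are assumptions that may differ between Hadoop versions. The last check models the "fewer than 2 Reduce tasks" restriction as a check on the configured limit, which is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of the Uber mode decision; an illustration, not Hadoop source code. */
final class UberDecisionSketch {

    /** Reads a numeric property, falling back to the given default when it is unset. */
    static long get(Map<String, String> conf, String key, long defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Long.parseLong(value);
    }

    /** Returns true when every condition described in the list above holds. */
    static boolean isUber(Map<String, String> conf, int numMapTasks, int numReduceTasks,
                          long inputBytes, long blockSize, boolean chainJob) {
        long maxMaps = get(conf, "mapreduce.job.ubertask.maxmaps", 9);
        long maxReduces = get(conf, "mapreduce.job.ubertask.maxreduces", 1);
        long maxBytes = get(conf, "mapreduce.job.ubertask.maxbytes", blockSize);

        // Assumed fallback values for the AM container and the task requests.
        long amMemoryMb = get(conf, "yarn.app.mapreduce.am.resource.mb", 1536);
        long amVcores = get(conf, "yarn.app.mapreduce.am.resource.cpu-vcores", 1);
        long mapMb = get(conf, "mapreduce.map.memory.mb", 1024);
        long reduceMb = get(conf, "mapreduce.reduce.memory.mb", 1024);
        long mapVcores = get(conf, "mapreduce.map.cpu.vcores", 1);
        long reduceVcores = get(conf, "mapreduce.reduce.cpu.vcores", 1);

        boolean uberEnabled = Boolean.parseBoolean(
                conf.getOrDefault("mapreduce.job.ubertask.enable", "false"));
        boolean smallNumMapTasks = numMapTasks <= maxMaps;
        boolean smallNumReduceTasks = numReduceTasks <= maxReduces;
        boolean smallInput = inputBytes <= maxBytes;
        boolean smallMemory = Math.max(mapMb, reduceMb) <= amMemoryMb;
        boolean smallCpu = Math.max(mapVcores, reduceVcores) <= amVcores;
        boolean notChainJob = !chainJob;
        // Assumption: the "fewer than 2 Reduce tasks" rule is checked on the configured limit.
        boolean isValidUberMaxReduces = maxReduces <= 1;

        return uberEnabled && smallNumMapTasks && smallNumReduceTasks && smallInput
                && smallMemory && smallCpu && notChainJob && isValidUberMaxReduces;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("mapreduce.job.ubertask.enable", "true");
        long blockSize = 128L * 1024 * 1024; // hypothetical 128 MB HDFS block
        long inputBytes = 10L * 1024 * 1024; // hypothetical 10 MB input

        // A small job: 3 Map tasks, 1 Reduce task, no ChainMapper/ChainReducer.
        System.out.println(isUber(conf, 3, 1, inputBytes, blockSize, false));
        // The same job with 10 Map tasks exceeds the default maxmaps limit of 9.
        System.out.println(isUber(conf, 10, 1, inputBytes, blockSize, false));
    }
}
```

Because the flags are combined with &&, a single failing condition is enough to make the job fall back to the normal mode with separate task containers. For example, requesting more Map memory than the AM container has disqualifies an otherwise small job.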
-