Static analysis is still based on Mach-O files. This article assumes you are already familiar with Mach-O concepts and focuses on problems encountered in practice.
Problems encountered in Mach-O static analysis
In principle, the unused classes in a Mach-O file are: unused_class = __objc_classlist – __objc_classrefs. In practice this produces false positives, because the following kinds of classes are in use but are not recorded in the __objc_classrefs section:
1. +load method
The class overrides the +load method to run some logic, so it is in use even though nothing references it.
The __DATA __objc_nlclslist section lists the classes that implement +load, so these are subtracted as well:
unused_class = __objc_classlist – __objc_classrefs – __objc_nlclslist
import os

def get_imp_load_class_points(path, binary_file_arch):
    # Collect the pointers of classes that implement +load (__objc_nlclslist)
    list_pointers = set()
    lines = os.popen("/usr/bin/otool -v -s __DATA __objc_nlclslist %s" % path).readlines()
    for line in lines:
        # pointers_from_binary is a helper defined elsewhere in the script
        pointers = pointers_from_binary(line, binary_file_arch)
        if not pointers:
            continue
        list_pointers = list_pointers.union(pointers)
    if len(list_pointers) == 0:
        return None
    return list_pointers
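Once each section has been read into a set of pointers, the adjusted formula is plain set arithmetic. The following is a minimal sketch, assuming the three sets were already collected by functions like the one above; the name `unused_class_pointers` and the sample addresses are made up for illustration and are not part of the original script:

```python
def unused_class_pointers(classlist, classrefs, nlclslist=None):
    """__objc_classlist minus __objc_classrefs minus __objc_nlclslist."""
    # The collector above returns None when a section is empty,
    # so treat None as an empty set.
    classrefs = classrefs or set()
    nlclslist = nlclslist or set()
    return set(classlist) - set(classrefs) - set(nlclslist)


# Hypothetical 16-digit pointers in the form otool prints them
classlist = {"0000000100008a10", "0000000100008a60", "0000000100008ab0"}
classrefs = {"0000000100008a10"}
nlclslist = {"0000000100008a60"}
print(unused_class_pointers(classlist, classrefs, nlclslist))
```

Only the pointer that is neither referenced nor listed as a +load class survives the subtraction.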
2. Dynamically invoked classes
Objective-C is a dynamic language and can instantiate a class from a string. For example, a route registered with a hard-coded string is eventually resolved with NSClassFromString(@"ClassName"), and a xib view is initialized from its class name.
import os
import re

def get_cstring_list(path):
    # Collect every string constant stored in __TEXT __cstring
    cstring_list = set()
    # Each otool line is a 16-digit address followed by the string
    re_cstring = re.compile(r"(\w{16}) (.+)")
    lines = os.popen("/usr/bin/otool -v -s __TEXT __cstring %s" % path).readlines()
    for line in lines:
        result = re_cstring.findall(line)
        if result:
            (address, cstring) = result[0]
            cstring = cstring.lstrip()
            if len(cstring):
                cstring_list.add(cstring)
    if len(cstring_list) == 0:
        return None
    return cstring_list
Solution: the __TEXT __cstring section contains the C/Objective-C string constants defined in the code. If a string constant matches a class name in unused_class, the class may be invoked dynamically, so we keep it. This step can be relatively time-consuming because it traverses every string in the project; how far to optimize it depends on how long it actually takes.
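The filtering step itself might look like the following. This is a minimal sketch, assuming the unused class names and the __cstring contents are both available as sets of strings; `drop_possibly_dynamic` is a hypothetical name, and the sample data is invented. It only catches exact matches, so a class name assembled at run time from string fragments would still slip through:

```python
def drop_possibly_dynamic(unused_classes, cstring_list):
    """Keep only the unused classes whose names never appear as a string constant."""
    # get_cstring_list returns None when the section is empty
    cstring_list = cstring_list or set()
    return {name for name in unused_classes if name not in cstring_list}


# Hypothetical data: "RouteHandler" is only ever created via NSClassFromString
unused = {"RouteHandler", "LegacyCell", "OldManager"}
cstrings = {"RouteHandler", "Hello, world", "%@ - %d"}
print(sorted(drop_possibly_dynamic(unused, cstrings)))
```

Because both sides are sets, each class costs one hash lookup rather than a scan over every string, which keeps this step cheap even for a large project.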
3. Used only as a property/member variable
Properties and member variables can be treated as the same case, because a property automatically generates a backing member variable.
4. Used as a base class
When a class is used only as a base class, it is its subclasses that are actually referenced. The base class itself is not referenced, so it is not recorded. A network request class, for example, may only ever issue requests through its subclasses.
Solution for 3 and 4: the otool -ov command dumps the Objective-C details of the Mach-O file. Reading the output line by line yields each class's information, including its superclass, metaclass, methods, properties, and member variables. If a parent class is in unused_class but one of its subclasses is not, the parent class is considered used as well. Likewise, a class that appears as the type of a member variable is considered used.
import os
import re

def filter_super_class_and_ivars(unref_class_list, path):
    # path: the Mach-O file to inspect (the original snippet read it from an outer scope)
    re_subclass_name = re.compile(r"\w{16} 0x\w{9} _OBJC_CLASS_\$_(.+)")  # subclass
    re_superclass_name = re.compile(r"\s*superclass 0x\w{9} _OBJC_CLASS_\$_(.+)")  # parent class
    re_ivars_type_name = re.compile(r' {12}type {6}0x\w{9} @"(.+)"')  # member variable type
    lines = os.popen("/usr/bin/otool -ov %s" % path).readlines()
    subclass_name = ""
    superclass_name = ""
    for line in lines:
        subclass_match_result = re_subclass_name.findall(line)
        if subclass_match_result:
            subclass_name = subclass_match_result[0]
        superclass_match_result = re_superclass_name.findall(line)
        if superclass_match_result:
            superclass_name = superclass_match_result[0]
        # A class used as the type of a member variable counts as used
        ivars_type_name_result = re_ivars_type_name.findall(line)
        if ivars_type_name_result:
            ivars_type_name = ivars_type_name_result[0]
            if ivars_type_name in unref_class_list:
                unref_class_list.remove(ivars_type_name)
        # Once both names of a class record are known, apply the base-class rule
        if len(subclass_name) > 0 and len(superclass_name) > 0:
            if superclass_name in unref_class_list and subclass_name not in unref_class_list:
                unref_class_list.remove(superclass_name)
            superclass_name = ""
            subclass_name = ""
    return unref_class_list
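One more case: a class may be referenced only by classes that are themselves unused, so it still appears in __objc_classrefs. Is such a class unused as well?
The answer is yes. This case requires multiple rounds of analysis to find such an unused_class.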
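Running the analysis several times can be sketched as a loop that repeats detection and removal until a round finds nothing new. This is only an illustration under stated assumptions: `find_unused_until_stable`, `detect`, and `remove` are hypothetical names, and the toy reference graph stands in for "remove the classes, rebuild, and rerun the Mach-O analysis", which the original does not show as code:

```python
def find_unused_until_stable(detect, remove, max_rounds=10):
    """Rerun `detect` after each removal until it reports nothing new."""
    all_unused = set()
    for _ in range(max_rounds):
        found = set(detect()) - all_unused
        if not found:
            break
        remove(found)
        all_unused |= found
    return all_unused


# Toy stand-in for the real analysis: class -> classes it references.
# "Main" plays the role of the entry point and is never reported.
references = {"Main": {"D"}, "A": {"B"}, "B": {"C"}, "C": set(), "D": set()}
remaining = set(references)

def detect():
    referenced = set()
    for cls in remaining:
        referenced |= references[cls]
    return remaining - referenced - {"Main"}

def remove(classes):
    remaining.difference_update(classes)

print(sorted(find_unused_until_stable(detect, remove)))
```

In the toy graph only A is unreferenced at first; B and C surface in later rounds, once the class that referenced them has been removed.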
How to classify unused classes in a componentized project
The first analysis was a little frustrating: the unused classes came out as an unordered pile of class names belonging to many different components. We needed a way to group them by component so that each business team could claim its own.
Classification
From Pod::Installer we can get every Pod target and its source_build_phase, which corresponds to Build Phases -> Compile Sources in Xcode, and match the unused classes against those source files. The drawback is that a class defined inside another class's source file will not be matched to any component. Add the following to the Podfile:
post_install do |installer|
  installer.pods_project.targets.each do |target|
    # Skip the aggregate "Pods-..." targets
    if !target.name.include?('Pods-')
      if !target.instance_of?(Xcodeproj::Project::Object::PBXAggregateTarget)
        puts("-pod name:" + target.name + "\n")
        # Files listed under Build Phases -> Compile Sources
        source_files = target.source_build_phase.files
        source_files.each do |file|
          puts("--class name:" + file.display_name + "\n")
        end
        puts("\n")
      end
    end
  end
end
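The output of this hook can then be matched against the unused class list. Below is a minimal sketch, assuming the hook's output was saved as text lines and that each class lives in a source file of the same name; `parse_pod_log`, `group_by_component`, and the sample pod names are hypothetical. Classes with no matching source file, such as the inner classes mentioned above, end up under "<unknown>":

```python
def parse_pod_log(lines):
    """Build {source file name without extension: pod name} from the hook output."""
    owner = {}
    pod = None
    for line in lines:
        line = line.strip()
        if line.startswith("--class name:"):
            if pod is not None:
                file_name = line[len("--class name:"):]
                owner[file_name.rsplit(".", 1)[0]] = pod
        elif line.startswith("-pod name:"):
            pod = line[len("-pod name:"):]
    return owner


def group_by_component(unused_classes, owner):
    """Group unused class names by the pod that compiles them."""
    groups = {}
    for name in sorted(unused_classes):
        groups.setdefault(owner.get(name, "<unknown>"), []).append(name)
    return groups


# Hypothetical hook output
log = [
    "-pod name:Network",
    "--class name:HTTPClient.m",
    "--class name:Retry.m",
    "",
    "-pod name:UIKitExt",
    "--class name:FancyButton.m",
]
owner = parse_pod_log(log)
print(group_by_component({"Retry", "FancyButton", "InnerHelper"}, owner))
```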
Removing the references
You can also remove these classes with a script for quick verification. With Xcodeproj (github.com/CocoaPods/X…) you can remove the file references from Build Phases -> Compile Sources and Build Phases -> Headers in Xcode. Note also that xxx-umbrella.h is a header file generated during the pod install phase. Removing the class references does not update this file, so the reference to the removed header must be deleted from it separately.
target.source_build_phase.remove_file_reference(file.file_ref)
target.headers_build_phase.remove_file_reference(file.file_ref)
post_install do |installer|
  installer.pods_project.targets.each do |target|
    if !target.name.include?('Pods-')
      if !target.instance_of?(Xcodeproj::Project::Object::PBXAggregateTarget)
        # unused_class_hash maps a pod name to its unused class names (built from the analysis results)
        unused_classes = unused_class_hash[target.name]
        if unused_classes != nil
          puts("Target: #{target.name}".green)
          # Iterate over a copy, because entries are removed inside the loop
          source_files = target.source_build_phase.files.to_a
          source_files.each do |file|
            file_name = file.display_name[0..(file.display_name.length - 3)] # strip the ".m" suffix
            if unused_classes.include?(file_name)
              # Remove the source file from Compile Sources
              target.source_build_phase.remove_file_reference(file.file_ref)
              target.source_build_phase.remove_build_file(file)
              puts("-remove source reference: #{file_name}".red)
              # Delete the reference to this class in xxx-umbrella.h (helper not shown)
              remove_umbrella_headers(target.name, file_name)
              # Remove references in the code; only traverse the component the class belongs to (helper not shown)
              delete_code_reference("./Pods/#{target.name}", file_name)
            end
          end
          headers_files = target.headers_build_phase.files.to_a
          headers_files.each do |file|
            file_name = file.display_name[0..(file.display_name.length - 3)] # strip the ".h" suffix
            if unused_classes.include?(file_name)
              # Remove the header from the Headers build phase
              target.headers_build_phase.remove_file_reference(file.file_ref)
              target.headers_build_phase.remove_build_file(file)
            end
          end
        end
      end
    end
  end
end
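The remove_umbrella_headers helper is not shown in the script above. As a rough idea of what cleaning up xxx-umbrella.h involves, here is a sketch that drops the matching #import lines from the header text. It assumes the umbrella header imports each class header on its own line, and `strip_umbrella_imports` is a hypothetical name, not the author's helper:

```python
import re


def strip_umbrella_imports(header_text, class_names):
    """Drop '#import "Name.h"' lines for the given class names."""
    names = "|".join(re.escape(n) for n in sorted(class_names))
    if not names:
        return header_text
    # Matches #import "Name.h" and #import <Pod/Name.h>, but not "NameOther.h"
    pattern = re.compile(r'^\s*#import\s+["<](?:[^">]*/)?(?:' + names + r')\.h[">]\s*$')
    kept = [line for line in header_text.splitlines(keepends=True)
            if not pattern.match(line)]
    return "".join(kept)


# Hypothetical umbrella header content
header = '#import <UIKit/UIKit.h>\n#import "Foo.h"\n#import "FooBar.h"\n#import "Bar.h"\n'
print(strip_umbrella_imports(header, {"Foo"}))
```

Matching on the whole file name keeps FooBar.h in place when only Foo is being removed.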
Final notes
Of course, whether a class can finally be removed still requires careful verification against the code. The cases above should cover and resolve most of the errors in unused_class = __objc_classlist – __objc_classrefs.