DiskSorter is optimized for modern multi-core and multi-CPU systems and is capable of classifying files stored on multiple disks,
directories or network shares in parallel using all CPUs installed in the computer. DiskSorter provides a number of different performance
optimization options allowing one to tune the file classification operations for user-specific hardware and storage configurations.
In order to customize the file classification performance optimization options, open the file classification operation dialog,
press the 'Options' button and select the 'Advanced' tab. The 'File Classification Threads' option controls how many parallel
threads are used to classify files. The 'Directories Scanning Threads' option controls how many parallel threads are used to scan
input disks, directories and network shares. In the 'Fault-Tolerant' directory scanning mode, DiskSorter uses an individual
processing thread for each input disk, directory or network share, but limits the maximum number of parallel scanning threads
to the specified value. In the 'High-Performance' directory scanning mode, DiskSorter always uses the specified number of parallel
directory scanning threads even when processing a single input disk, directory or network share.
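The difference between the two directory scanning modes can be summarized in a short Python sketch. The function name, the mode strings and the parameters are hypothetical and serve only to illustrate the rule described above; they are not part of DiskSorter.

```python
def scanning_threads(mode: str, num_inputs: int, max_threads: int) -> int:
    """Return how many directory scanning threads would run in parallel.

    Hypothetical helper that mirrors the rule described in the text;
    it is not a DiskSorter API.
    """
    if num_inputs < 1 or max_threads < 1:
        raise ValueError("num_inputs and max_threads must be positive")
    if mode == "fault-tolerant":
        # One thread per input disk, directory or share, capped at the limit.
        return min(num_inputs, max_threads)
    if mode == "high-performance":
        # Always the configured number of threads, even for a single input.
        return max_threads
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for mode in ("fault-tolerant", "high-performance"):
        for inputs in (1, 4, 12):
            print(mode, inputs, scanning_threads(mode, inputs, max_threads=8))
```

With a limit of 8 threads and a single input directory, this sketch yields 1 thread in the 'Fault-Tolerant' mode and 8 threads in the 'High-Performance' mode.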
For example, when classifying files stored on a high-speed NVMe SSD disk, DiskSorter reaches up to 63,000 files/sec using a single
file classification thread. With two parallel file classification threads, the performance scales up to 98,000 files/sec and with
four parallel file classification threads, the performance increases up to 123,000 files/sec showing a very good multi-threaded
performance scalability. With six processing threads the file classification performance reaches up to 127,000 files/sec and with
eight processing threads the performance increases up to 129,000 files/sec, which makes it possible to classify and categorize
10 million files within two minutes.
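The scaling described above can be checked with a few lines of Python. This is only an arithmetic sketch based on the peak rates quoted in the text; the helper names are illustrative.

```python
# Peak NVMe SSD rates quoted in the text: threads -> files/sec.
nvme_rates = {1: 63_000, 2: 98_000, 4: 123_000, 6: 127_000, 8: 129_000}


def speedup(rates: dict[int, int], threads: int) -> float:
    """Throughput relative to the single-thread rate."""
    return rates[threads] / rates[1]


def seconds_to_classify(num_files: int, files_per_sec: int) -> float:
    """Best-case time to classify num_files at a sustained peak rate."""
    return num_files / files_per_sec


if __name__ == "__main__":
    for threads, rate in nvme_rates.items():
        secs = seconds_to_classify(10_000_000, rate)
        print(f"{threads} threads: {speedup(nvme_rates, threads):.2f}x, "
              f"10 million files in about {secs:.0f} s")
```

At the quoted peak rate of 129,000 files/sec, 10 million files take roughly 78 seconds, which is consistent with the two-minute figure. Most of the gain comes from the first four threads.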
When classifying files stored on regular SATA SSD drives, which are significantly slower than NVMe SSD drives, the performance
of the file classification process reaches up to 24,000 files/sec using a single file classification thread and scales up to
69,000 files/sec with four parallel file classification threads. With eight parallel file classification threads, the performance
reaches up to 98,000 files/sec, which makes it possible to classify and categorize huge numbers of files relatively fast.
Classifying files stored on a NAS storage device via a network is more complicated because the user needs to take into account
the speed and the latency of the network. If the computer on which DiskSorter is installed is connected to the NAS storage
device via a high-speed, low-latency network, the performance of the file classification operations may reach up to 12,000
files/sec with one file classification thread, scale up to 49,000 files/sec with four parallel file classification threads
and increase up to 82,000 files/sec with eight parallel file classification threads.
On the other hand, if DiskSorter needs to access network shares via the Internet or via a long-distance, high-latency network,
the performance of the file classification operations will be relatively slow. One option to increase the performance
of the file classification operations in such configurations is to select the 'High-Performance' directory scanning mode and increase
the number of parallel directory scanning threads to 16 or even 32, regardless of how many CPUs are actually installed in the computer.
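A simple back-of-the-envelope model shows why a thread count higher than the number of CPUs can help on a high-latency link: scanning threads spend most of their time waiting for the network rather than using a CPU. The sketch below assumes that each thread issues one request at a time and waits a full round trip for the reply. The function name and the 50 ms round-trip time are illustrative assumptions, not DiskSorter measurements.

```python
def max_requests_per_sec(threads: int, round_trip_ms: float) -> float:
    """Upper bound on the request rate when every thread waits one
    network round trip per request (simplified, latency-bound model)."""
    if threads < 1 or round_trip_ms <= 0:
        raise ValueError("threads and round_trip_ms must be positive")
    return threads * 1000.0 / round_trip_ms


if __name__ == "__main__":
    # 50 ms is an assumed round-trip time for a long-distance link.
    for threads in (4, 16, 32):
        rate = max_requests_per_sec(threads, round_trip_ms=50.0)
        print(f"{threads} threads: at most {rate:.0f} requests/sec")
```

In this model the upper bound grows linearly with the number of threads until some other resource, such as bandwidth or the remote server, becomes the bottleneck.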
Classifying millions of files stored on one or more large NAS servers may be a very time-consuming operation, and one way to
speed up the file classification process is to perform the file classification operation via a 2.5 Gigabit Ethernet network.
With 2.5 Gigabit Ethernet, the performance of the DiskSorter file classification operations continues to scale up to 80,000 files/sec
with 8 parallel file classification threads.
Due to the very wide adoption of laptops and NAS servers with built-in WiFi network interfaces, many users may consider classifying files
stored on NAS servers via a wireless network. However, the latency of a wireless network is much higher, and therefore it takes much
more time to complete the file classification operation via a wireless network. The question is how much longer the user will need to
wait and whether performing the file classification operation via a wired network saves a significant amount of time.
Based on our benchmarks, via a 5 GHz wireless network, DiskSorter reaches up to 1,500 files/sec with a single file classification thread
and scales up to 12,000 files/sec with 8 parallel file classification threads, which is approximately 5 times slower than
standard Gigabit Ethernet and 7 times slower than 2.5 Gigabit Ethernet. So, if the user needs to classify up to 1 million
files, it is possible to use a wireless network. On the other hand, if the user needs to classify a large NAS server with many millions
of files, a low-latency Gigabit Ethernet or 2.5 Gigabit Ethernet is required.
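To put the waiting time in perspective, the following Python sketch compares the 8-thread rates quoted above for the 5 GHz wireless network and for 2.5 Gigabit Ethernet. The constant and function names are illustrative, the 50 million file count is an arbitrary example, and the results are best-case estimates that assume the peak rate is sustained for the whole operation.

```python
WIFI_5GHZ_RATE = 12_000   # files/sec with 8 threads, as quoted in the text
ETH_2_5G_RATE = 80_000    # files/sec with 8 threads, as quoted in the text


def minutes_to_classify(num_files: int, files_per_sec: int) -> float:
    """Best-case duration in minutes at a sustained peak rate."""
    return num_files / files_per_sec / 60.0


if __name__ == "__main__":
    print(f"slowdown: {ETH_2_5G_RATE / WIFI_5GHZ_RATE:.1f}x")
    for num_files in (1_000_000, 50_000_000):
        wifi = minutes_to_classify(num_files, WIFI_5GHZ_RATE)
        wired = minutes_to_classify(num_files, ETH_2_5G_RATE)
        print(f"{num_files:,} files: wireless about {wifi:.1f} min, "
              f"2.5 Gigabit Ethernet about {wired:.1f} min")
```

For 1 million files the difference is on the order of a minute, while for tens of millions of files it grows to about an hour, which is why the wired network matters mainly for large NAS servers.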
Modern USB flash drives provide plenty of storage space and are reasonably fast, allowing one to store vast amounts of data for
backup purposes. Sometimes, it may be necessary to classify and categorize files on a USB flash drive in order to free up used disk
space. When classifying files stored on a USB flash drive, DiskSorter can reach up to 15,000 files/sec with a single file classification
thread. With two parallel file classification threads, the performance increases up to 18,000 files/sec, which is good, but more than two
file classification threads will slightly degrade the performance of the file classification operations.
Today, modern IT environments widely deploy virtual servers and/or virtual workstations. Most of the popular virtualization platforms
provide a high level of performance, but some performance degradation is inevitable when a file classification operation is
executed on a guest virtual machine compared to the same file classification operation executed directly on the host computer.
For example, when a virtual machine with 4 virtual CPUs is stored on an NVMe SSD disk and classifies files stored on a virtual local
disk drive, which is physically stored on the same NVMe SSD disk, the performance of the file classification operations reaches up
to 25,000 files/sec using a single file classification thread. With two parallel file classification threads, the performance of the
file classification operations scales up to 37,000 files/sec and with four parallel file classification threads, the performance of
the file classification operations increases up to 48,000 files/sec, allowing one to classify and categorize huge numbers of files relatively fast.
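As a rough illustration of the virtualization overhead, the sketch below compares the virtual machine figures with the NVMe SSD figures quoted earlier for a physical computer. It assumes that both sets of numbers were measured on comparable hardware, which the text does not state, so the ratios should be read as an indication only. The variable and function names are illustrative.

```python
# Peak rates quoted in the text: threads -> files/sec.
host_nvme_rates = {1: 63_000, 2: 98_000, 4: 123_000}
guest_vm_rates = {1: 25_000, 2: 37_000, 4: 48_000}


def guest_to_host_ratio(threads: int) -> float:
    """Fraction of the physical-host rate reached inside the guest VM.

    Assumes the two benchmarks are directly comparable, which is an
    assumption of this sketch rather than a statement from the text.
    """
    return guest_vm_rates[threads] / host_nvme_rates[threads]


if __name__ == "__main__":
    for threads in guest_vm_rates:
        print(f"{threads} threads: guest reaches "
              f"{guest_to_host_ratio(threads):.0%} of the host rate")
```

Under that assumption, the guest reaches roughly 40% of the physical rate at each thread count, while still scaling with the number of threads.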