Don't feel like reading a dissertation?
Skip to the front of the line!
1. Introducing DFO™
Suppose that you've got two versions of the same file or document. Suppose that they differ enough from each other that one of them would almost certainly constitute a separate work of its own, even if this weren't legally the case. Suppose that you wish to communicate this altered version of the file, but only to those few who are already in possession of the original version ― and to no one else. If this roughly describes your situation, then we've got just the tool for you.
Meet DFO™, or the Differential File Overlay utility. DFO™ allows its users to securely communicate the modified contents of a file or a batch of files to another party in possession of the tool, with the added requirement that such a party must also possess the original version of said file or files.
Note
While the tool does handle encryption and decryption, it does not provide any functionality to actually transmit or receive the resulting output files over a computer network. Those actions must be performed separately by the user.
2. The Method
How does it all work? As the tool's complete name suggests, DFO™ computes a mathematical differential between two versions of a given file. What this means is that the resulting differential output contains the data of neither version, but rather a scrambled amalgamation of the two, one that should be close to impossible to unscramble. Provided that a sufficient degree of mutual distinction exists between the two versions of your document, a potential attacker or snoop would be statistically better off guessing their contents than attempting to decipher them.
This method is as simple and effective as a crowbar, a tool against which no defense exists to this day. While such an approach might seem redundant in the age of strong encryption, it nonetheless presents an advantage that sets it firmly apart from the rest. Whereas most popular encryption methods force a sender to infer that the intended recipient is an authorized party by virtue of possessing the correct decryption key, the differential method employed by DFO™ turns such an inference into a guarantee: there simply is no way to decipher the encrypted files unless their original versions are also available to the deciphering party. To put it more bluntly, if you do not have the originals, then you are not seeing the revisions, either, no matter who you are.
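The manual does not spell out the exact mathematics behind the differential, so the following is only a stand-in to illustrate the general idea. It assumes a plain byte-for-byte XOR between the modified file and the original, with the original repeated if it is shorter. The function name `xor_overlay` and the sample data are hypothetical and are not part of the tool.

```python
from itertools import cycle


def xor_overlay(data: bytes, key: bytes) -> bytes:
    """XOR every byte of `data` with the matching byte of `key`.

    Hypothetical stand-in for the tool's differential, not its real algorithm.
    The key is repeated if it is shorter than the data.
    """
    if not key:
        raise ValueError("key file must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))


original = b"Quarterly report, first draft."
modified = b"Quarterly report, final figures attached."

encoded = xor_overlay(modified, original)  # what the sender would transmit
decoded = xor_overlay(encoded, original)   # the recipient applies the same original

print(decoded == modified)
```

Applying the same original a second time restores the modified file, which is why a party without the original has nothing to work from. Note that in this toy version the identical opening words of the two files turn into runs of zero bytes, which hints at why files that are too similar make for a weak result.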
3. Using DFO™
The tool is split up into two parts: the encoder and the decoder. To encrypt a file or a batch of files, the user runs the encoder over them alongside identically named key files that the tool can use to differentiate against. These key files are nothing more than the original versions of the files whose modified versions the user wishes to encrypt.
Output files are then produced by the tool in a format that can only be understood by its counterpart, the decoder.
In the following example, we'll be encrypting a set of three modified files, each of which comes with an unaltered earlier version of that file. Here's what the command and its resulting output would look like.
C:\DFO> dfo .. originals encoded "Operating Agreement.odt" "Quarterly Report.pdf" Minutes.mp3
Differential File Overlay Creation Utility v1.0
Copyright (C) 2023 by Open Source Laboratories, Ltd.
All rights reserved.
Operating Agreement.odt -> file_0.dfo at 75.70% / 100.00% [CRC32: 0x50fd92b1]
Quarterly Report.pdf -> file_1.dfo at 75.20% / 100.00% [CRC32: 0x5e4d7a7b]
Minutes.mp3 -> file_2.dfo at 77.23% / 99.99% [CRC32: 0xe6945c40]
Files processed: 3 of 3
3.1 Command Overview
Looks scary, doesn't it? It's actually quite simple. First, let's break down the command itself.
C:\DFO> dfo .. originals encoded "Operating Agreement.odt" "Quarterly Report.pdf" Minutes.mp3
...
Command Prompt
We start at the command prompt, which tells us that we're in the C:\DFO directory. The choice of directory is completely up to the user, as long as the dfo.exe executable file is located there.
...
Executable Command
This is where we specify the name of the executable command, which is dfo. Executing this command will have the effect of running the executable program file dfo.exe, which is the encoder utility.
...
Directories
Next, we specify the directories where the encoder will be able to find the files it needs in order to proceed. The user must always specify three directories, which must not overlap. The order of the directories is also important, because it determines their respective function from the tool's point of view.
- The first directory, .., contains the input files. These are the files whose contents we wish to encrypt.
- The second directory, originals, is the directory that contains the key files. These files represent the original versions of the input files ― they are the files that the intended recipient already has in their possession.
- The third and final directory, encoded, is where the encoder will place the newly encrypted .dfo files. These are the files that the user can then safely transmit to the intended recipient, without fear of their contents becoming compromised in transit.
Note
The directory names are arbitrary, while the directories themselves can be located anywhere within the file system. The directories can be nested, but they must not overlap with each other.
Warning
It is critical to completely clear the output directory between runs. If this is not done, the tool's output could collide with an existing file still in that directory. When this occurs, the tool will skip processing the file in question and continue on to the next one in the queue.
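The two rules above can be checked before each run with a small script. The sketch below is not part of the tool. It assumes that "overlap" means two of the three arguments resolving to the same directory, and the helper name `check_directories` is hypothetical.

```python
from pathlib import Path


def check_directories(input_dir: str, key_dir: str, output_dir: str) -> list[str]:
    """Return a list of problems found; an empty list means the layout looks fine."""
    problems = []
    resolved = [Path(d).resolve() for d in (input_dir, key_dir, output_dir)]

    # Assumption: "overlapping" means two arguments point at the same directory.
    if len(set(resolved)) != 3:
        problems.append("two or more directories resolve to the same location")

    # Leftover files in the output directory could collide with new output.
    out = resolved[2]
    if out.is_dir() and any(out.iterdir()):
        problems.append(f"output directory is not empty: {out}")
    return problems


for problem in check_directories("..", "originals", "encoded"):
    print("warning:", problem)
```

Running a check like this first avoids the silent skip described in the warning, since a stale file is reported before the encoder ever starts.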
...
Input Files
Finally, we specify the files that the tool should encrypt: Operating Agreement.odt, Quarterly Report.pdf, and Minutes.mp3. There is no fixed limit to how many files can be supplied at once. The exact number of files the tool can handle will depend on the operating system and machine used. If the user manages to exceed that limit, it may be necessary to run the tool more than once.
Important
If any of the filenames contain spaces, such filenames must be enclosed in quotes. If this is not done, the tool will become confused and will attempt to operate on a fragmented filename as if each fragment represented a separate filename of its own.
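When the encoder is launched from a script rather than typed by hand, the quoting problem can be sidestepped by passing every argument as a separate list item. The sketch below assumes the same argument order as the example command. The helper name `build_dfo_command` is hypothetical.

```python
def build_dfo_command(input_dir, key_dir, output_dir, filenames):
    """Return the argument list for one encoder run, one list item per argument."""
    return ["dfo", input_dir, key_dir, output_dir, *filenames]


files = ["Operating Agreement.odt", "Quarterly Report.pdf", "Minutes.mp3"]
command = build_dfo_command("..", "originals", "encoded", files)

# Each filename stays a single argument, spaces included.
print(len(command))  # 7

# Splitting on whitespace shows the fragmentation the warning describes.
print("Operating Agreement.odt".split())  # ['Operating', 'Agreement.odt']

# To actually run the encoder (requires the dfo executable to be reachable):
# import subprocess
# subprocess.run(command, check=True)
```

Python's `subprocess.run` accepts a list like this directly and takes care of any quoting the operating system needs.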
3.2 Output Breakdown
Now that we've broken down the command used, let's take a look at the output, line by line.
Operating Agreement.odt -> file_0.dfo at 75.70% / 100.00% [CRC32: 0x50fd92b1]
Quarterly Report.pdf -> file_1.dfo at 75.20% / 100.00% [CRC32: 0x5e4d7a7b]
Minutes.mp3 -> file_2.dfo at 77.23% / 99.99% [CRC32: 0xe6945c40]
...
Input File
First and foremost, the tool informs us which input file is being encoded. Any information concerning the processing of that file will be displayed on the same line, as long as there is room.
...
Output File
Next on the line is the name of the output file. These files are named sequentially by the tool as it creates them. The special .dfo filename extension lets the decoder know that it is dealing with a file encoded in the DFO™ format.
Note
While the output filenames are somewhat nondescript to make it easier to serialize them, the input filenames are not discarded, but rather stored inside the output files themselves. Even if an output file is renamed, the decoder utility will still be able to open it and extract the input filename stored within it, provided that its .dfo extension has not been changed.
...
Bitwise Uniqueness Ratio
When a file is encoded using DFO™, its contents become mathematically spliced with those of the key file and stored in its corresponding output file for later retrieval. The degree to which the output is unique with respect to the input directly reflects the strength of the resulting encryption.
This relationship can be summarized as follows:
- If the bitwise uniqueness ratio is below 5%, then the input and output are too similar to each other and the resulting encryption strength is low.
- Likewise, if the bitwise uniqueness ratio is above 95%, then the input and output are mirror opposites of each other and the resulting encryption strength is also low.
- However, if the bitwise uniqueness ratio is around 50%, then the input and output differ just enough that one cannot be mathematically inferred from the other. The resulting encryption strength in this case is high.
Tip
Do not fixate on the idea of magically hitting the 50% bitwise uniqueness ratio for all of your files. The key takeaway with this ratio is to stay away from the edges rather than to land perfectly in the center.
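For larger batches, the encoder's screen output can be scanned for files that land near either edge. The sketch below assumes the line layout shown in the example output, with the first percentage being the bitwise ratio and the second the bytewise ratio. It reuses the 5% and 95% thresholds from the list above. The pattern and the helper name `bitwise_warning` are hypothetical.

```python
import re

# Assumed layout: "<input> -> <output> at <bitwise>% / <bytewise>% [CRC32: 0x........]"
LINE = re.compile(
    r"^(?P<input>.+) -> (?P<output>\S+) at "
    r"(?P<bitwise>\d+(?:\.\d+)?)% / (?P<bytewise>\d+(?:\.\d+)?)% "
    r"\[CRC32: (?P<crc>0x[0-9a-fA-F]{8})\]$"
)


def bitwise_warning(line: str):
    """Return a warning string for an edge-case bitwise ratio, else None."""
    match = LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized encoder line: {line!r}")
    ratio = float(match["bitwise"])
    if ratio < 5.0 or ratio > 95.0:
        return f"{match['input']}: bitwise uniqueness {ratio:.2f}% is near an edge"
    return None


sample = "Minutes.mp3 -> file_2.dfo at 77.23% / 99.99% [CRC32: 0xe6945c40]"
print(bitwise_warning(sample))  # None
```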
...
Bytewise Uniqueness Ratio
While bitwise uniqueness plays an important role in determining encryption strength, bytewise uniqueness is no less important. Unlike its counterpart, which represents how many bits have been changed in the resulting output, this parameter indicates how many bytes have been altered.
Bytewise uniqueness can be interpreted as follows:
- If the bytewise uniqueness ratio is at 0%, then the input and output are identical. If this should occur, no encryption will have effectively taken place.
- If the bytewise uniqueness ratio is at 100%, then every single byte that makes up the output will have been changed in some way, though it is unclear how many bits will have been altered within each byte.
- In most cases, the bytewise uniqueness ratio will hover above 90%. What this means is that roughly nine out of every ten bytes in the output will have been altered in some fashion, but it will not be known which specific bytes were affected or how these bytes are distributed within the output file.
Warning
Although any unchanged bytes will likely appear out of sequence within the output file, this may not always be the case. The longer a sequence of unchanged bytes, the more likely it is to be susceptible to cryptographic analysis. Avoid transmitting output files with a low bytewise uniqueness ratio to avoid partial data leaks.
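To make the difference between the two ratios concrete, here is a minimal sketch that computes both for a pair of equal-length byte strings. How the tool itself derives its figures is not documented here, so the formulas below (changed bits over total bits, changed bytes over total bytes) are an assumption, and the function name `uniqueness_ratios` is hypothetical.

```python
def uniqueness_ratios(a: bytes, b: bytes) -> tuple[float, float]:
    """Return (bitwise, bytewise) difference ratios in percent for equal-length inputs."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    # Count the bits that differ in each byte pair.
    changed_bits = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    # Count the byte pairs that differ at all.
    changed_bytes = sum(x != y for x, y in zip(a, b))
    return 100.0 * changed_bits / (8 * len(a)), 100.0 * changed_bytes / len(a)


# Every byte differs, yet only one bit per byte has changed.
print(uniqueness_ratios(b"\x00\x00\x00\x00", b"\x01\x01\x01\x01"))  # (12.5, 100.0)
```

The example shows why a 100% bytewise ratio says little on its own: each byte was altered, but only one bit in eight actually changed.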
...
Input File Checksum
A Cyclic Redundancy Check, or CRC, is the result of a mathematical computation performed on a file: its contents are treated as one long binary number and divided by a fixed polynomial, and the remainder of that division becomes the checksum. To ensure data integrity, DFO™ will calculate a 32-bit CRC checksum on every input file that it processes. In addition to being displayed during the encoding process, this checksum will also be stored inside the .dfo file to which said input pertains.
In essence, a CRC acts as a numerical fingerprint. The purpose of such a fingerprint is two-fold:
- A checksum can help identify the contents from which the checksum was calculated.
- Additionally, a checksum can make data corruption easier to detect and diagnose. Because any given checksum is mathematically tied to the data from which it was calculated, a matching checksum on a batch of data can serve as an indicator that said data has not been altered by accident. A CRC is not designed to withstand deliberate tampering, however.
Important
While a matching checksum cannot guarantee data integrity on its own, a failed checksum all but guarantees that data corruption has occurred. Always assume that a file with a bad checksum is troublesome.
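For readers who would like to compute a checksum of their own for comparison, the sketch below uses the CRC-32 routine from Python's standard `zlib` module. Whether the tool uses this same CRC-32 variant is an assumption, so treat a mismatch with caution. The helper names `crc32_hex` and `crc32_of_file` are hypothetical.

```python
import zlib


def crc32_hex(chunks) -> str:
    """Compute a CRC-32 over an iterable of byte chunks, formatted like 0x50fd92b1."""
    crc = 0
    for chunk in chunks:
        crc = zlib.crc32(chunk, crc)  # feed the running value back in
    return f"0x{crc & 0xFFFFFFFF:08x}"


def crc32_of_file(path: str, chunk_size: int = 65536) -> str:
    """Read a file in chunks so that large files need not fit in memory."""
    with open(path, "rb") as handle:
        return crc32_hex(iter(lambda: handle.read(chunk_size), b""))


print(crc32_hex([b"123456789"]))  # 0xcbf43926, the standard CRC-32 check value
```

Changing even a single byte of the input produces a different result, which is what makes the value useful as a fingerprint.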
4. Decrypting DFO™ Files
So, you've got yourself a batch of .dfo files and you aren't sure what to do with them. This is where the decoder utility comes in. As is the case with its counterpart, the user runs the decoder over the encrypted .dfo files alongside any key files that the user wishes to differentiate against. If the key files are a cryptographic match to the .dfo files, the latter can be decrypted.
In the following example, we'll be decrypting a set of three .dfo files which we had previously encrypted in Using DFO, each coming with its own counterpart representing an earlier version of that file. Here's what the command and its resulting output would look like.
C:\DFO> undfo encoded originals decoded file_0.dfo file_1.dfo file_2.dfo
Differential File Overlay Extractor Utility v1.0
Copyright (C) 2023 by Open Source Laboratories, Ltd.
All rights reserved.
file_0.dfo -> Operating Agreement.odt [CRC32 OK]
file_1.dfo -> Quarterly Report.pdf [CRC32 OK]
file_2.dfo -> Minutes.mp3 [CRC32 OK]
Files processed: 3 of 3
4.1 Command Overview
While the screen output of the decoder utility is considerably less complex than that of the encoder, it is still worth breaking down. First, let's take a look at the command itself.
C:\DFO> undfo encoded originals decoded file_0.dfo file_1.dfo file_2.dfo
...
Command Prompt
Like before, we start at the command prompt, which tells us that we're in the C:\DFO directory. The choice of directory is completely up to the user, as long as the undfo.exe executable file is located there.
...
Executable Command
This is where we specify the name of the executable command, which is undfo. Executing this command will have the effect of running the executable program file undfo.exe, which is the decoder utility.
...
Directories
Next, we specify the directories where the decoder will be able to find the files it needs in order to proceed. The user must always specify three directories, which must not overlap. The order of the directories is also important, because it determines their respective function from the tool's point of view.
- The first directory, encoded, contains the .dfo files. These are the files whose contents the tool will attempt to decrypt.
- The second directory, originals, is the directory that contains the key files. These files represent the original versions of the documents, which both the sender and recipient possess. The tool will differentiate the .dfo files against them so that decryption can take place.
- The third and final directory, decoded, is where the decoder will place the newly decrypted input files which we had previously encrypted in Using DFO.
Note
The directory names are arbitrary, while the directories themselves can be located anywhere within the file system. The directories can be nested, but they must not overlap with each other.
Warning
It is critical to completely clear the output directory between runs. If this is not done, the tool's output could collide with an existing file still in that directory. When this occurs, the tool will skip processing the file in question and continue on to the next one in the queue.
...
Encrypted DFO Files
These are the files that the tool will attempt to decrypt: file_0.dfo, file_1.dfo, and file_2.dfo. There is no fixed limit to how many files can be specified at once. The exact number of files the tool can handle will depend on the operating system and machine used. If the user manages to exceed that limit, it may be necessary to run the tool more than once.
4.2 Output Breakdown
Now that we've broken down the command used, let us once again take a look at the output, line by line.
file_0.dfo -> Operating Agreement.odt [CRC32 OK]
file_1.dfo -> Quarterly Report.pdf [CRC32 OK]
file_2.dfo -> Minutes.mp3 [CRC32 OK]
...
Encrypted DFO File
Once again, the tool informs us which particular file is being processed. This time, however, the file in question is an encrypted .dfo file that we'd like for the tool to decrypt for us.
...
Decrypted Input File
Next on the line is the name of the freshly decrypted input file which had been encrypted in Using DFO.
Important
When the decoder processes a .dfo file, it extracts the name of the input file contained within. If the tool cannot then find a key file with a filename that is an exact match, decryption cannot take place and the file will be skipped.
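If the original filenames are already known, for instance because the sender has shared them, a quick check can tell in advance which key files are missing. The sketch below is not part of the tool, and the helper name `missing_key_files` is hypothetical. It compares names case-sensitively, which is an assumption about what "exact match" means.

```python
from pathlib import Path


def missing_key_files(stored_names, key_dir):
    """Return the expected filenames that have no same-named file in key_dir."""
    present = {entry.name for entry in Path(key_dir).iterdir() if entry.is_file()}
    return [name for name in stored_names if name not in present]


# Example call, assuming an "originals" directory exists:
# missing_key_files(
#     ["Operating Agreement.odt", "Quarterly Report.pdf", "Minutes.mp3"],
#     "originals",
# )
```

Any name returned by the function corresponds to a .dfo file that the decoder would have to skip.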
...
Decrypted File Checksum
As discussed previously, a CRC checksum is calculated from the input file during the encryption process in order to assess file integrity upon decryption. This 32-bit checksum is stored inside the encrypted .dfo file. Once the decoder utility has fully decrypted the .dfo file and extracted the payload, its checksum is compared to the one stored in the file.
If the two should match, then the decryption process is deemed to have succeeded. However, if the CRC contained in the file does not match the one calculated from the newly reconstituted input file, decryption is practically guaranteed to have failed. Regardless of outcome, an appropriate message is displayed for each file processed.
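When many files are decoded at once, the per-file messages can be scanned automatically. The sketch below relies only on the success marker shown in the example output, `[CRC32 OK]`, and treats every other per-file line as suspect. The wording the tool uses for failures or skipped files is not shown in this manual, so no attempt is made to match it. The helper name `failed_decodes` is hypothetical.

```python
def failed_decodes(output_lines):
    """Return the per-file lines that do not end with the '[CRC32 OK]' marker."""
    failures = []
    for line in output_lines:
        line = line.strip()
        if " -> " not in line:
            continue  # banner, blank, or summary line
        if not line.endswith("[CRC32 OK]"):
            failures.append(line)
    return failures


captured = [
    "Differential File Overlay Extractor Utility v1.0",
    "file_0.dfo -> Operating Agreement.odt [CRC32 OK]",
    "file_1.dfo -> Quarterly Report.pdf [CRC32 OK]",
    "file_2.dfo -> Minutes.mp3 [CRC32 OK]",
    "Files processed: 3 of 3",
]
print(failed_decodes(captured))  # []
```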
5. Frequently Asked Questions
How secure is DFO™?
That depends entirely on the key file used. Generally speaking, the less mathematically connected a given input file is to its respective key file, the stronger the resulting encryption will be.
When are two files mathematically connected?
Two files are said to be mathematically connected if one of the following is true:
- The files are too similar to each other. It is essential that any input file you attempt to encrypt with DFO™ be reasonably dissimilar from the key file that will be used to encrypt it. A high bitwise uniqueness ratio could serve as an indicator that the key file in question is too similar to the input file it was used to encrypt.
- The files are too different from each other. In this case, DFO™ is likely to report a low bitwise uniqueness ratio. If this should occur, the resulting output file is all but identical to the input file from which it was produced, with little to no encryption having effectively taken place. Avoid this scenario at all costs.
What are the minimum system requirements to run DFO™?
At a bare minimum, DFO™ requires a personal computer powered by an Intel® Pentium Pro™ microprocessor running Microsoft® Windows 98™ or higher.
Can DFO™ execute on a more modern system?
As a computationally intensive software application, DFO™ benefits greatly from a faster microprocessor. The general rule of thumb is: The faster the CPU, the faster DFO™ can run. However, the speed of the microprocessor in no way impacts the accuracy of any calculations performed by DFO™ or the strength of encryption it provides.
What is the latest operating system supported by DFO™?
DFO™ has been shown to run on Microsoft® Windows 10™ 22H2, but will also run on any Microsoft® Windows™ operating system as far back as Windows 98™.
Which other operating systems does DFO™ support?
In addition to Microsoft® Windows™, DFO™ can run natively on any qualifying Linux® computer that has glibc v2.11 or later installed. DFO™ can also run natively on FreeBSD® through its Linux® ABI compatibility layer.
Which CPU architectures does DFO™ support?
DFO™ will run on any x86-based 32-bit or 64-bit microprocessor ranging from the Intel® Pentium Pro™ and up. Non-x86 RISC-based microarchitectures are currently unsupported, although this may change in the future.
I have come into possession of some .dfo files. Do I have to purchase DFO™ to decrypt them?
No, you do not require a commercial licence to DFO™ to decrypt .dfo files. Because DFO™ is split up into two separate utilities, separate licensing terms apply. The utility you need in order to decrypt a .dfo file can be obtained free of charge, subject to a freeware-style licence that can be found on this site.
Can I distribute the DFO™ software suite?
Distribution of any part of the DFO™ software suite falls under the following rules:
- To distribute the encoder utility, you must negotiate a distribution agreement with us. Any distribution of the encoder outside the terms of such an agreement shall constitute copyright infringement and is prohibited by law. On Microsoft® Windows™, the encoder is represented by the dfo.exe executable file and its associated material. On Linux®, the executable filename is simply dfo, without an extension.
- To distribute the decoder utility, you must agree to a freeware-style licence that governs its distribution. Neither the decoder utility nor the licence text may be changed by you in any way. On Microsoft® Windows™, the decoder is represented by the undfo.exe executable file and its associated material. On Linux®, the executable filename is simply undfo, without an extension.
6. Download DFO™
Interested? Then head on over to our downloads page and introduce a new layer of security into your digital life!