Hamming Distance
- Created by
- Renato Passos, Software Engineer
- Reviewed by
- Renato Passos, Software Engineer
Last updated: Apr 18, 2026
About this calculator
The Hamming distance measures the dissimilarity between two character sequences of equal length, such as DNA, RNA, or protein sequences. It is the number of positions at which the sequences differ. For example, the DNA sequences 'ATCG' and 'ATGG' have a Hamming distance of 1, since they differ only at the third position. This measure is useful for similarity and dissimilarity analyses in molecular biology.
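The counting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation; the function name `hamming_distance` and the choice to raise `ValueError` for unequal lengths are assumptions made for this example.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    # Assumption: unequal lengths are treated as an error rather than padded or aligned.
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # Compare the sequences position by position and count the mismatches.
    return sum(x != y for x, y in zip(a, b))


print(hamming_distance("ATCG", "ATGG"))  # prints 1
```

The comparison is case-sensitive and treats every character as a distinct symbol, so inputs may need to be normalized (for example, to upper case) before they are compared.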
The formula is simple: the Hamming distance is the number of positions at which the two sequences differ. Dividing that count by the sequence length gives the normalized Hamming distance, the proportion of differing positions, which ranges from 0 to 1. Both forms assume that the sequences have the same length and that differences arise from single-character substitutions, so insertions and deletions are not accounted for. The Hamming distance is often used in phylogenetic studies to reconstruct evolutionary trees and understand the relationships between species.
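In symbols, for two sequences x and y of equal length n, the count of differing positions and its length-normalized form can be written as follows. The notation (d_H for the count, p for the proportion) is an assumption chosen for this illustration and does not come from the calculator itself.

```latex
d_H(x, y) = \sum_{i=1}^{n} \mathbf{1}[x_i \neq y_i],
\qquad
p(x, y) = \frac{d_H(x, y)}{n}
```

For 'ATCG' and 'ATGG', n = 4 and only the third position differs, so d_H = 1 and p = 1/4 = 0.25.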
When should you use the Hamming distance? In molecular evolution studies that compare gene or protein sequences from different organisms, it can help estimate the evolutionary distance between them. It has limitations, however: when multiple substitutions occur at the same site, at most one difference is observed there, so the Hamming distance can underestimate the actual number of changes.
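The underestimation caused by repeated substitutions at one site can be seen in a small hand-built example. The sketch below is illustrative only: the helper `apply_substitutions` and the substitution history are hypothetical, chosen so that two events hit the same position.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def apply_substitutions(seq: str, events: list[tuple[int, str]]) -> str:
    """Apply (position, new_base) substitutions in order and return the result."""
    bases = list(seq)
    for position, new_base in events:
        bases[position] = new_base
    return "".join(bases)


ancestor = "ATCG"
# Hypothetical history: C -> G, then G -> T, both at index 2.
events = [(2, "G"), (2, "T")]
descendant = apply_substitutions(ancestor, events)

print(descendant)                              # prints ATTG
print(len(events))                             # prints 2 (substitution events)
print(hamming_distance(ancestor, descendant))  # prints 1 (observed differences)
```

Two substitutions occurred, but only one difference is visible when the ancestor and descendant are compared, which is why the observed count is a lower bound on the number of changes.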
A common mistake when using the Hamming distance is to assume that it directly reflects evolutionary distance. Factors such as mutation rates that vary over time and recombination events can distort the relationship between Hamming distance and evolutionary time, so they should be taken into account when interpreting the results.
Frequently asked questions
What is Hamming distance?
It is a measure of dissimilarity between two character sequences, such as DNA or proteins, counting the number of positions where they differ.
How to calculate Hamming distance?
Count the number of positions at which the two sequences differ. Dividing that count by the sequence length gives the normalized Hamming distance, the proportion of differing positions.
When to use Hamming distance?
Use it in molecular evolution studies to compare gene or protein sequences from different organisms and estimate the evolutionary distance between them.
What are the limitations of Hamming distance?
It assumes that sequences have the same length and that differences are the result of single character substitutions. It may underestimate the actual distance if multiple substitutions occur at the same site.
How to interpret Hamming distance results?
Interpret the results with factors such as variable mutation rates and recombination events in mind, since they can influence the relationship between Hamming distance and evolutionary time.