## Mistakes I Have Made

This page is dedicated to the various mistakes that I have made using Stata. Learn from them! Have you made a mistake using Stata that you think others could learn from? Email me at djiboliz@gmail.com.

### expecting decimal operations to be exact

Stata (being a computer program) stores numbers in binary, and there is no exact binary representation of decimals such as 0.1. Stata's default storage type for
numbers is the float, which has about 7 digits of accuracy. It does calculations and comparisons in double precision, which is about 16 digits. A literal `0.1` in an
expression is therefore a double, and it is not equal to the float-rounded 0.1 stored in the variable. This means you can get the following:

    . gen x = 0.1
    . count if x == 0.1
    0

To avoid this issue, you could store your numbers with decimal points as doubles.

    . gen double n = 0.1
    . count if n == 0.1
    1

But doubles take twice as much space to store, and you can't use this method if you are using variables that have been entered or created by someone else.
Instead, use the `float()` function when doing comparisons. It rounds its argument to float accuracy, so both sides of the comparison are rounded the same way.

    . count if float(x) == float(0.1)
    1
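
To see where the mismatch comes from, it can help to print both values with more digits than Stata normally shows. The following is a minimal sketch meant to be run as a do-file. It assumes an empty dataset with a single observation, and the `%21.18f` display format is just one convenient choice for showing enough digits.

```stata
* Sketch: compare 0.1 stored as a float with 0.1 as a double literal.
* Assumes it is fine to clear the data in memory.
clear
set obs 1

gen x = 0.1                  // stored as a float (Stata's default type)

display %21.18f x[1]         // float-rounded value, roughly 0.1000000015
display %21.18f 0.1          // double literal, much closer to 0.1

count if x == 0.1            // float compared with double: no match
count if x == float(0.1)     // both sides at float accuracy: match
```

Since *x* is already a float, wrapping only the literal in `float()` is enough here. Wrapping both sides, as above, is the safer habit when you don't know how a variable was stored.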

### using `if varx` instead of `if varx == 1`

Stata treats any nonzero value as TRUE, and missing values are nonzero, so `if varx` evaluates to TRUE if *varx* is missing. If *varx* is a binary
variable, your command will be executed for observations where *varx* is equal to 1 and
for observations where it's missing. This is probably not what you want.
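
A small made-up example shows the difference. This is only a sketch to run as a do-file: the three values typed in below are hypothetical, and *varx* stands for any binary variable.

```stata
* Sketch: `if varx` versus `if varx == 1` on a toy binary variable.
clear
input varx
1
0
.
end

count if varx          // counts 2 observations: the 1 and the missing value
count if varx == 1     // counts 1 observation: only the 1
```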

### using `if varx != .` instead of `if varx < .`

Stata 8 and later support extended missing values
such as `.a` and `.b`. Some datasets use these extended missing values to differentiate between
different reasons why the value is missing (for example, `.a` might mean "don't know" while `.b` means "not applicable"). Stata counts every missing value as greater than any number, and the extended missing values as greater than `.` itself, so `varx != .` is TRUE for `.a` and `.b`. If you want to restrict your command to observations where *varx* is not missing, use `if varx < .` to be safe. Similarly, if you want to
drop all observations where *varx* is missing, use `drop if varx >= .` instead of `drop if varx == .`.
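
Here is a short sketch of how the two conditions differ, again using made-up values and run as a do-file. The `missing()` function shown at the end is another way to write the same test; whether you prefer it to `< .` is a matter of taste.

```stata
* Sketch: extended missing values slip past `!= .` but not past `< .`.
clear
input varx
1
.
.a
.b
end

count if varx != .          // counts 3: the 1, plus .a and .b
count if varx < .           // counts 1: only the nonmissing value
count if !missing(varx)     // counts 1: same result as `varx < .`

drop if varx >= .           // drops ., .a and .b
count                       // 1 observation left
```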