Motivation
The books that I read in the past didn’t explain what a dataframe meant.
Definition
- Dataframe
- A table of data in which the values of each observed variable is contained in the same column.
Counterexample
I’ve difficulty in reading long lines of text like the above definition, so let’s illustrate this definition with a counterexample.
We have carried out repeated experiments with four types of things and obtaine some data. (Say, poured some liquid into an empty cup and take the temperature.)
Coke | Juice | Soup | Beer |
---|---|---|---|
7.5 | 1.5 | 6.2 | 9.8 |
9.4 | 1.7 | 3.6 | 5.8 |
2.4 | 2.5 | 3.7 | 8.9 |
That’s not a dataframe because the temperatures (observed variable) are put into four columns.
Example
Let’s convert the above table into a dataframe so as to understand the definition.
Liquid | Temperature |
---|---|
Coke | 7.5 |
Juice | 1.5 |
Soup | 6.2 |
Beer | 9.8 |
… | … |
Beer | 8.9 |