Once your hypothesis is defined and your model is formulated, you
should do one more exercise before you start collecting data. Draw a
picture.
Put pen to paper and sketch the ideal graphic(s) you would expect to
see in your final paper. It might be a histogram, a boxplot, a
scatterplot, a regression surface, whatever. Just draw it. Label your
axes, give it a title, and think about where the points should fall.
If you can do this, then you're ready to go.
For example, your hypothesis might be that the ERA
of a pitching staff when the primary catcher is behind the plate is
equivalent to the ERA when the back-ups are in the game. You
might sketch something like this:
I made that graph with SPSS. Your hand-drawn picture might just be
the X and Y axes with a few sample points and a line that fits them.
In either case, it shows that this is a testable hypothesis.
If your hypothesis involves a straight comparison
of two means, you might sketch a boxplot to represent what your data
will (hopefully) look like. Suppose you were interested in
assessing the difference between mean ERA in 1968 and 2000:
(That graph also comes from SPSS 12.0. Those are actual data, showing
the outstanding performances of Bob Gibson and Pedro Martinez relative
to the years in which they pitched. We use this plot in our
sabermetrics course to show that Petey's 2000 performance is just as
impressive as Gibson's microscopic 1968 ERA.)
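If you would rather rough out the boxplot in numbers before drawing it, here is a minimal Python sketch of the five values each box is built from. The function name `five_number_summary`, the group labels, and the ERA values are all placeholders invented for illustration; they are not the 1968 or 2000 data.

```python
import statistics


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max), the values a basic boxplot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


# Placeholder ERAs, made up purely to show the shape of the comparison.
era_by_group = {
    "season_a": [1.9, 2.4, 2.8, 3.1, 3.6],
    "season_b": [3.2, 4.1, 4.6, 5.0, 5.7],
}

for group, eras in era_by_group.items():
    low, q1, median, q3, high = five_number_summary(eras)
    print(f"{group}: min={low} Q1={q1} median={median} Q3={q3} max={high}")
```

Quartile conventions differ between packages, so the box edges may not match what SPSS draws for the same data. The point is only to see whether the two boxes you expect would overlap.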
Finally, suppose your hypothesis was that slugging
percentage has increased over the past 15 years. Your sketch
might look like this:
Again, a simple sketch will do. If you can label your axes, define
your X and Y variables, and think about which means/slopes/areas your
analysis will compare, then you're ready to start collecting data. Go
for it.
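For a trend hypothesis like the slugging example, the quantity your sketch commits you to is a slope. A minimal Python sketch of an ordinary least-squares slope follows. The helper name `least_squares_slope` and the slugging values are hypothetical placeholders, not real league figures.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Placeholder data: season index on X, slugging percentage on Y (invented).
seasons = [1, 2, 3, 4, 5]
slugging = [0.380, 0.385, 0.395, 0.400, 0.410]

slope = least_squares_slope(seasons, slugging)
print(f"estimated change in SLG per season: {slope:+.4f}")
```

A positive slope matches the upward line you would have drawn. Whether it is distinguishable from zero is the question the real analysis has to answer.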