When you show standard deviations in the error bars you can draw conclusions about the spread of the data.
When you show confidence intervals you can draw conclusions about differences between the groups. So when you want to show the results of a statistical test (= comparison) you typically show 95% CI in the error bars.
So to decide what to use in error bars you have to think about the message you want to convey: do you want to show the data or do you want to compare the groups?
A good indication that you want to compare groups is the fact that you want to add p values to the plot. Other indications are expressions like “higher than”, “increase in”, “compared to”, “consistently lower than” when you describe the content of the plot.
The 95% CI shows the interval that contains (with 95% certainty) the mean of the population. Most parametric statistical tests will compare the means of the groups to draw conclusions about the means of the populations.
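For concreteness, here is a minimal Python sketch (with made-up values) of how a 95% CI of the mean is computed from a sample, using the t distribution:

```python
import numpy as np
from scipy import stats

# Hypothetical measurements for one group (illustrative values only)
x = np.array([4.1, 5.3, 4.8, 6.0, 5.5, 4.9, 5.2, 5.8])

mean = x.mean()
sem = stats.sem(x)      # standard error of the mean
n = len(x)

# 95% CI of the mean, based on the t distribution with n - 1 df
t_crit = stats.t.ppf(0.975, df=n - 1)
ci_low, ci_high = mean - t_crit * sem, mean + t_crit * sem

print(f"mean = {mean:.2f}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

This is the interval that, with 95% confidence, contains the population mean; it is what the error bars show when you choose 95% CIs.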
Using the 95% CI in the error bars has a nice feature: if the 95% CIs of two groups do not overlap, the difference between the groups is statistically significant at the 0.05 level (the reverse does not hold: overlapping CIs can still correspond to a significant difference).
When you use the SEM in the error bars you lose this nice feature. SEM bars do not allow you to draw conclusions about the difference between the groups from the error bars alone.
So it’s not wrong to use the SEM; it’s just that 95% CIs are more informative.
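To get a feel for the sizes involved, here is a small Python sketch (made-up numbers) computing all three candidate error bars, SD, SEM, and 95% CI half-width, for two groups; the 95% CI half-width is always wider than the SEM (roughly 2× the SEM for moderate sample sizes):

```python
import numpy as np
from scipy import stats

# Two hypothetical groups (illustrative values only)
groups = {
    "control":   np.array([4.1, 5.3, 4.8, 6.0, 5.5, 4.9]),
    "treatment": np.array([6.2, 7.1, 6.8, 7.5, 6.9, 7.3]),
}

for name, x in groups.items():
    n = len(x)
    sd = x.std(ddof=1)                          # spread of the data
    sem = sd / np.sqrt(n)                       # precision of the mean estimate
    half_ci = stats.t.ppf(0.975, n - 1) * sem   # half-width of the 95% CI
    print(f"{name}: mean = {x.mean():.2f}, SD = {sd:.2f}, "
          f"SEM = {sem:.2f}, 95% CI half-width = {half_ci:.2f}")
```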
I prefer to plot the log transformed values because I’m a fan of always showing what was used in the statistical test on the plot (hence the log transformed values).
Some journals will ask you to back transform the data and the statistics. Note that back transforming statistics like the mean is tricky: the back-transformed mean of the log values is not the mean of the original data but its geometric mean.
If I do back transform, I show the geometric mean of the original data on the plot because:
The geometric mean is easy to calculate in Prism; in R it takes a bit more work, but packages like emmeans provide functions for back transformation.
The geometric mean is equal to the result of applying the inverse function (10 to the power of the mean, in the case of a log10 transformation) to the mean of the log values. In this way you can apply the inverse function to all the statistics, including the 95% CI. However, it is unclear what to call the result: it’s not really the CI of the original (arithmetic) mean.
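A small Python sketch (made-up positive values) showing that back-transforming the mean of the log10 values gives exactly the geometric mean, and that the 95% CI can be back-transformed the same way:

```python
import numpy as np
from scipy import stats

x = np.array([2.0, 8.0, 32.0, 128.0])   # hypothetical positive data

log_x = np.log10(x)
mean_log = log_x.mean()

# Back-transforming the mean of the logs gives the geometric mean
# (for this data: the fourth root of 2 * 8 * 32 * 128, i.e. 16)
geo_mean = 10 ** mean_log

# scipy computes the geometric mean directly; the two agree
assert np.isclose(geo_mean, stats.gmean(x))

# The 95% CI of the mean of the logs can be back-transformed the same way,
# giving an (asymmetric) interval around the geometric mean
sem = stats.sem(log_x)
t_crit = stats.t.ppf(0.975, df=len(x) - 1)
lo, hi = 10 ** (mean_log - t_crit * sem), 10 ** (mean_log + t_crit * sem)
print(f"geometric mean = {geo_mean:.2f}, back-transformed CI = [{lo:.2f}, {hi:.2f}]")
```

Note the back-transformed interval is not symmetric around the geometric mean, which is expected on the original scale.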
Among statisticians there’s a lot of discussion on this as you can see here.
After an ANOVA or Kruskal-Wallis test you have one or more p values generated by the omnibus test itself and multiple p values generated by the post hoc tests. On the graph you show the p values of the post hoc tests.
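As a sketch of this two-level structure (made-up data; a plain Bonferroni correction on pairwise t-tests stands in here for whichever post hoc test you actually use, e.g. Tukey or Dunn):

```python
import itertools
import numpy as np
from scipy import stats

# Three hypothetical groups (illustrative values only)
data = {
    "A": np.array([4.1, 5.3, 4.8, 6.0, 5.5]),
    "B": np.array([6.2, 7.1, 6.8, 7.5, 6.9]),
    "C": np.array([5.0, 5.6, 5.2, 6.1, 5.4]),
}

# One p value from the ANOVA itself (the omnibus test)
f_stat, p_anova = stats.f_oneway(*data.values())
print(f"ANOVA: F = {f_stat:.2f}, p = {p_anova:.4f}")

# Multiple p values from the post hoc comparisons;
# these are the ones you would show on the graph
pairs = list(itertools.combinations(data, 2))
for a, b in pairs:
    t, p = stats.ttest_ind(data[a], data[b])
    p_adj = min(p * len(pairs), 1.0)   # Bonferroni adjustment
    print(f"{a} vs {b}: adjusted p = {p_adj:.4f}")
```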
Showing p values on the graph is not mandatory. If you do, show the actual p values and not asterisks (*).
If there are many comparisons you may show only the significant p values. If you do so, mention in the legend that you performed all comparisons but only show those with a p value < 0.05.
Ideally, you make a table with the results of all comparisons, showing the difference, its 95% CI, and the p value.
A p value alone is not very informative; the difference and its 95% CI allow biological interpretation of the results.
Below you can see an example of such a table (you don’t have to include the column with T values):
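Such a table can also be assembled programmatically. Below is a minimal Python sketch (made-up data; Welch t-tests serve as the illustrative post hoc method) that reports, for each comparison, the difference, its 95% CI, and the p value:

```python
import itertools
import numpy as np
from scipy import stats

# Hypothetical data for three groups (illustrative values only)
data = {
    "A": np.array([4.1, 5.3, 4.8, 6.0, 5.5]),
    "B": np.array([6.2, 7.1, 6.8, 7.5, 6.9]),
    "C": np.array([5.0, 5.6, 5.2, 6.1, 5.4]),
}

def welch_ci(x, y, conf=0.95):
    """Difference in means with its Welch-approximation confidence interval."""
    nx, ny = len(x), len(y)
    vx, vy = x.var(ddof=1), y.var(ddof=1)
    se = np.sqrt(vx / nx + vy / ny)
    # Welch-Satterthwaite degrees of freedom
    df = (vx / nx + vy / ny) ** 2 / (
        (vx / nx) ** 2 / (nx - 1) + (vy / ny) ** 2 / (ny - 1))
    t_crit = stats.t.ppf(0.5 + conf / 2, df)
    d = x.mean() - y.mean()
    return d, d - t_crit * se, d + t_crit * se

print(f"{'comparison':<12}{'diff':>7}{'95% CI':>20}{'p':>10}")
for a, b in itertools.combinations(data, 2):
    d, lo, hi = welch_ci(data[a], data[b])
    p = stats.ttest_ind(data[a], data[b], equal_var=False).pvalue
    print(f"{a + ' vs ' + b:<12}{d:7.2f}  [{lo:6.2f}, {hi:6.2f}]{p:10.4f}")
```

A 95% CI of the difference that excludes zero corresponds to p < 0.05 for the matching test, which is exactly the kind of consistency a reader can check from such a table.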