Statistics Archives | Insight Things

Category: Statistics

Alternative formula for the mean

An often forgotten formula for the mean of a non-negative, integer-valued random variable $X$ with cumulative distribution function $F$ is:

$\displaystyle \mu=E(X)=\sum_{x=0}^{\infty} \Big(1-F(x)\Big)$

And for the continuous case (again with $X \ge 0$):

$\displaystyle \mu=E(X)=\int_{0}^{\infty} \Big(1-F(x)\Big)\,dx$

This blog post is going to illustrate how these formulas arise.
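As a quick sanity check of the discrete formula, here is a minimal sketch that compares it with the usual definition of the mean. The fair six-sided die is my own illustrative choice, not an example taken from the post itself.

```python
from fractions import Fraction

# Fair six-sided die: P(X = k) = 1/6 for k = 1..6 (illustrative assumption).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x)."""
    return sum(p for k, p in pmf.items() if k <= x)

# Usual definition of the mean: sum of value times probability.
mean_direct = sum(k * p for k, p in pmf.items())

# Tail-sum formula: sum of 1 - F(x) over x = 0, 1, 2, ...
# The terms are zero once x reaches the largest value, so the sum is finite.
mean_tail = sum(1 - cdf(x) for x in range(0, max(pmf) + 1))

print(mean_direct, mean_tail)  # both print as 7/2
```

Exact fractions are used so that the two results can be compared without rounding noise.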

Chi-squared distribution and normal variance

July 26, 2019 / Jan Rothkegel / 0 Comments

Those of you who have already tested the variance of data from a normal distribution may have asked yourselves how the link between the normal variance and the chi-squared distribution arises. Trust me: the story I am about to tell you is an exciting one!
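The link in question is the standard result that, for a normal sample of size $n$ with variance $\sigma^2$, the quantity $(n-1)S^2/\sigma^2$ follows a chi-squared distribution with $n-1$ degrees of freedom. The following simulation is only a rough sketch of that result; the population parameters, sample size and seed are hypothetical values chosen for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

mu, sigma = 10.0, 2.0   # hypothetical population parameters
n = 5                   # sample size
reps = 20_000           # number of simulated samples

stats = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)   # equals (n - 1) * S^2
    stats.append(ss / sigma ** 2)

sim_mean = sum(stats) / reps
sim_var = sum((s - sim_mean) ** 2 for s in stats) / (reps - 1)

# A chi-squared variable with n - 1 degrees of freedom has mean n - 1
# and variance 2 * (n - 1).
print(f"simulated mean     {sim_mean:.2f} (theory: {n - 1})")
print(f"simulated variance {sim_var:.2f} (theory: {2 * (n - 1)})")
```

Matching the first two moments does not prove the distributional claim, but it is a cheap plausibility check.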

Trapped: Division by means and expected values

June 25, 2016 / Jan Rothkegel / 1 Comment

I was really surprised when I first thought about this kind of operation, namely division by arithmetic means and expected values. People tend to work with means and expected values very intuitively. You can add and multiply them without any issues. Dividing, on the other hand, can be misleading, and I am going to illustrate this with some neat examples.
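One way to see the trap is to compare the mean of reciprocals with the reciprocal of the mean. The two values below are made up for illustration and are not taken from the post's own examples.

```python
from fractions import Fraction

# Hypothetical data: Y takes the values 1 and 3 with equal probability.
y_values = [Fraction(1), Fraction(3)]

mean_y = sum(y_values) / len(y_values)                        # E(Y)
mean_of_reciprocals = sum(1 / y for y in y_values) / len(y_values)  # E(1/Y)
reciprocal_of_mean = 1 / mean_y                               # 1/E(Y)

print(mean_of_reciprocals, reciprocal_of_mean)  # prints 2/3 1/2
```

So in general $E(1/Y)$ is not $1/E(Y)$, and dividing by a mean is not the same as averaging the individual ratios.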

Power of Statistical Tests – The Principles

April 25, 2016 / Jan Rothkegel / 3 Comments

I guess that you already know a little bit about hypothesis testing. For instance, you might have carried out tests in which you tried to reject the hypothesis that your sample comes from a population with a hypothesized mean µ. Since the standardized sample mean follows a t-distribution (or approximately the normal distribution in the case of large samples), you can define a rejection region based on a specific significance level α. In a one-sided test, sample means that are less than a critical value CV might be considered rather unlikely. If the obtained sample’s mean falls into this region, the hypothesis is rejected at this particular significance level α.

Distribution of sample mean and rejection region for one-sided test

We know the chance of rejecting the hypothesis although it’s true (Type I error), because it is the chance of obtaining just one of those values from the rejection region (plotted red). But how likely are we to reject a hypothesis if it’s indeed false? In other words: How small is the Type II error?
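To make the question concrete, here is a minimal sketch of a power calculation for a lower-tailed test. It assumes a known population standard deviation, so the normal distribution is used instead of the t-distribution; the function name and all numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_lower_tailed(mu0, mu1, sigma, n, alpha):
    """Chance of rejecting H0: mean = mu0 when the true mean is mu1.

    Assumes a lower-tailed z-test with known sigma (simplifying assumption).
    """
    se = sigma / sqrt(n)                          # standard error of the mean
    cv = mu0 + NormalDist().inv_cdf(alpha) * se   # critical value CV
    return NormalDist(mu1, se).cdf(cv)            # P(sample mean < CV | mu1)

# Hypothetical setting: H0 mean 100, true mean 95, sigma 15, n = 36.
mu0, sigma, n, alpha = 100.0, 15.0, 36, 0.05
power = power_lower_tailed(mu0, 95.0, sigma, n, alpha)
type_ii = 1 - power

print(f"power = {power:.3f}, Type II error = {type_ii:.3f}")
```

If the true mean equals the hypothesized one, the same function returns α, which is exactly the Type I error.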

Log-normal distribution mistaken for normal

March 8, 2016 / Jan Rothkegel / 2 Comments

Quick question: Is BASF’s day-to-day rate of stock returns (shown below) distributed normally?

Day-to-day rates of return of BASF stock

Yes? In theory it isn’t! In theory rates of stock returns follow a lognormal distribution as I have shown in “On the distribution of stock return”. Unfortunately, most people don’t take the lognormal distribution seriously, although it is very often at work! This article shows how the lognormal distribution arises and why its shape is sometimes mistaken for a normal distribution.
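One reason for the confusion can be sketched with the closed-form skewness of a lognormal distribution, $(e^{\sigma^2}+2)\sqrt{e^{\sigma^2}-1}$, where $\sigma$ is the standard deviation of the logarithm. The $\sigma$ values below are hypothetical and not estimated from the BASF data.

```python
from math import exp, sqrt

def lognormal_skewness(sigma):
    """Skewness of a lognormal variable whose logarithm has std. dev. sigma."""
    w = exp(sigma ** 2)
    return (w + 2) * sqrt(w - 1)

# A normal distribution has skewness 0. Small sigma gives a nearly
# symmetric lognormal; large sigma gives a clearly right-skewed one.
for sigma in (0.02, 0.5, 1.0):
    print(f"sigma = {sigma}: skewness = {lognormal_skewness(sigma):.3f}")
```

With a spread as small as a typical day-to-day fluctuation, the skewness is close to zero, so a histogram looks almost like a bell curve.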

Why you can add variances

March 4, 2016 / Jan Rothkegel / 0 Comments

One of the foundations of statistics is the fact that you can add variances. Maybe this surprises you a little, because the formula for the variance does not suggest it at first glance. This article will show you the proof of why, and under which circumstances, adding variances is a valid practice. Please check my articles on the addition and multiplication of expected values if you are not yet familiar with them.
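The role of the circumstances can be sketched with two dice. The example is my own illustration: it enumerates the sum of two independent dice and, for contrast, the "sum" of a die with itself, which is a fully dependent case.

```python
from fractions import Fraction
from itertools import product

faces = range(1, 7)
p = Fraction(1, 6)

def variance(outcomes):
    """Variance of a list of (value, probability) pairs."""
    mean = sum(v * q for v, q in outcomes)
    return sum((v - mean) ** 2 * q for v, q in outcomes)

die = [(f, p) for f in faces]

# Independent case: two separate dice, all 36 combinations equally likely.
independent_sum = [(a + b, p * p) for a, b in product(faces, repeat=2)]

# Dependent case: the same die counted twice.
dependent_sum = [(2 * f, p) for f in faces]

var_die = variance(die)
print(var_die + var_die)          # 35/6
print(variance(independent_sum))  # 35/6, variances add
print(variance(dependent_sum))    # 35/3, variances do not add
```

For the dependent case the missing piece is the covariance term, which vanishes only when the variables are uncorrelated.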

Why you can multiply means and expected values

February 28, 2016 / Jan Rothkegel / 0 Comments

Maybe you have already had to multiply means or expected values. If you know, for instance, how often people go shopping on average and how much money they spend per shopping tour on average, then you could multiply both to obtain the average total amount spent. In this post I will explain why multiplying means and expected values is a valid operation.
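A small sketch of the shopping example follows. The standard condition for $E(XY)=E(X)\,E(Y)$ is that the two quantities are independent (or at least uncorrelated); all numbers here are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distributions: trips per month and money spent per trip.
trips = [2, 4]      # each with probability 1/2
spend = [10, 30]    # each with probability 1/2

mean_trips = Fraction(sum(trips), len(trips))   # 3
mean_spend = Fraction(sum(spend), len(spend))   # 20

# Independent case: every combination of trips and spend is equally likely.
pairs = list(product(trips, spend))
mean_total_independent = Fraction(sum(t * s for t, s in pairs), len(pairs))

# Dependent case: frequent shoppers always spend less per trip.
coupled = [(2, 30), (4, 10)]
mean_total_dependent = Fraction(sum(t * s for t, s in coupled), len(coupled))

print(mean_trips * mean_spend)   # 60
print(mean_total_independent)    # 60, matches the product of the means
print(mean_total_dependent)      # 50, does not match
```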

Why you can add means and expected values

February 27, 2016 / Jan Rothkegel / 0 Comments

In applied statistics you often have to combine data from different samples or distributions. One of the most frequently used operations here is to add means and expected values. For instance, you could sample people’s leg length, body and head height. The result when adding these means? It is the average body height, I hope!
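The body-height example can be sketched in a few lines. The measurements below are invented for illustration, with "torso" standing in for the middle segment; the point is only that the mean of the sums equals the sum of the means.

```python
from fractions import Fraction

# Hypothetical measurements in cm for three people.
legs = [80, 90, 100]
torso = [60, 62, 70]
head = [22, 23, 24]

def mean(values):
    """Exact arithmetic mean as a fraction."""
    return Fraction(sum(values), len(values))

# Total height of each person, then the mean of those totals.
heights = [l + t + h for l, t, h in zip(legs, torso, head)]
mean_of_sums = mean(heights)

# Sum of the three separate means.
sum_of_means = mean(legs) + mean(torso) + mean(head)

print(mean_of_sums, sum_of_means)  # prints 177 177
```

Unlike multiplication, this works regardless of whether the measurements are independent of each other.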