## Insight Things

### A scientific blog revealing the hidden links which shape our world

#### Category: Statistics

An often-forgotten formula for the mean of a non-negative, integer-valued random variable $X$ with distribution function $F$ is:

$\displaystyle \mu=E(X)=\sum_{x=0}^{\infty} \Big(1-F(x)\Big)$

And for the continuous case (again with $X \ge 0$):

$\displaystyle \mu=E(X)=\int_{0}^{\infty} \Big(1-F(x)\Big)\,dx$

This blog post is going to illustrate how these formulas arise.
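A quick numerical check makes the discrete formula concrete. The sketch below uses a fair six-sided die as an assumed example (it is not taken from the full article), and the helper names `mean_direct` and `mean_tail_sum` are made up for illustration.

```python
from fractions import Fraction


def mean_direct(pmf):
    """Usual definition: sum of x * P(X = x)."""
    return sum(x * p for x, p in pmf.items())


def mean_tail_sum(pmf):
    """Tail-sum formula: sum over x >= 0 of (1 - F(x))."""
    total = Fraction(0)
    cdf = Fraction(0)
    for x in range(max(pmf) + 1):
        cdf += pmf.get(x, Fraction(0))  # F(x) = P(X <= x)
        total += 1 - cdf                # 1 - F(x) = P(X > x)
    return total


# Assumed example: a fair die with faces 1..6
die = {face: Fraction(1, 6) for face in range(1, 7)}

print(mean_direct(die), mean_tail_sum(die))  # prints: 7/2 7/2
```

Both routes give 3.5, because each term $1-F(x)$ is just $P(X>x)$, and a value $k$ is counted once for every $x<k$.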


Those of you who have already tested the variance of normally distributed data may have asked yourselves how the link between the sample variance and the chi-squared distribution arises. Trust me: the story I am about to tell you is an exciting one!

It really surprised me when I first thought about this kind of operation, namely division by arithmetic means and expected values. People tend to work with means and expected values very intuitively. You can add and multiply them without any issues. Dividing, on the other hand, can be misleading, and I am going to illustrate this with some neat examples.
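To see the pitfall in the smallest possible setting, here is a sketch that compares the mean of a reciprocal with the reciprocal of the mean. The fair die is an assumed example of mine, not one of the examples from the full article.

```python
from fractions import Fraction

faces = range(1, 7)   # assumed example: a fair six-sided die Y
p = Fraction(1, 6)    # probability of each face

mean_y = sum(p * y for y in faces)                           # E(Y)   = 7/2
mean_of_reciprocal = sum(p * Fraction(1, y) for y in faces)  # E(1/Y) = 49/120
reciprocal_of_mean = 1 / mean_y                              # 1/E(Y) = 2/7

print(mean_of_reciprocal, reciprocal_of_mean)  # prints: 49/120 2/7
```

The two numbers differ (about 0.41 versus about 0.29), so dividing by a mean is not the same as averaging the quotients.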


I guess that you already know a little bit about hypothesis testing. For instance, you might have carried out tests in which you tried to reject the hypothesis that your sample comes from a population with a hypothesized mean µ. Because the standardized sample mean follows a t-distribution (or, for large samples, approximately the normal distribution), you can define a rejection region based on a specific significance level α. In a one-sided test, sample means that are less than a critical value CV are considered rather unlikely under the hypothesis. If the obtained sample’s mean falls into this region, the hypothesis is rejected at this particular significance level α.

Distribution of sample mean and rejection region for one-sided test

We know the chance of rejecting the hypothesis although it’s true (type I error), because it is the chance of obtaining just one of those values from the rejection region (plotted red). But how likely are we to reject a hypothesis if it’s indeed false? In other words: How small is the type II error?
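As a rough sketch of what that question means numerically, the snippet below computes the type II error β for a lower-tailed test. It makes two simplifying assumptions that go beyond the teaser: the population standard deviation is known (so the normal distribution is used instead of the t-distribution), and a specific true mean is given. The function name `type_ii_error` and its parameters are hypothetical.

```python
from statistics import NormalDist


def type_ii_error(mu0, mu_true, sigma, n, alpha=0.05):
    """Beta for a lower-tailed z-test of H0: mean = mu0 (sigma assumed known)."""
    se = sigma / n ** 0.5                     # standard error of the sample mean
    cv = NormalDist(mu0, se).inv_cdf(alpha)   # reject H0 if sample mean < cv
    # Type II error: the sample mean lands outside the rejection region
    # even though the true mean is mu_true rather than mu0.
    return 1.0 - NormalDist(mu_true, se).cdf(cv)


# Assumed numbers, for illustration only
beta = type_ii_error(mu0=100, mu_true=95, sigma=15, n=25)
print(f"type II error: {beta:.3f}, power: {1 - beta:.3f}")
```

The further the true mean lies below the hypothesized one, the smaller β becomes; if the true mean equals the hypothesized mean, the expression reduces to 1 − α.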

Quick question: Is BASF’s day-to-day rate of stock returns (shown below) distributed normally?

Day-to-day rates of return of BASF stock

Yes? In theory it isn’t! In theory, rates of stock returns follow a lognormal distribution, as I have shown in “On the distribution of stock return”. Unfortunately, most people don’t take the lognormal distribution seriously, although it is very often at work! This article shows how the lognormal distribution arises and why its shape is sometimes mistaken for a normal distribution.
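One common way to motivate the lognormal distribution is to multiply many independent positive factors: the logarithm of the product is a sum, and sums tend towards a normal distribution. The simulation below is a minimal sketch under that assumption. The uniform daily factors, the parameter values and the function name `simulate_price_ratios` are all made up for illustration and are not BASF data.

```python
import math
import random
import statistics


def simulate_price_ratios(n_paths=2000, n_days=250, spread=0.05, seed=0):
    """Final/initial price ratio after compounding random daily factors."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(n_paths):
        ratio = 1.0
        for _ in range(n_days):
            # Assumed daily growth factor, uniform around 1 (not normal!)
            ratio *= rng.uniform(1.0 - spread, 1.0 + spread)
        ratios.append(ratio)
    return ratios


ratios = simulate_price_ratios()
logs = [math.log(r) for r in ratios]

# Right skew on the original scale: the mean sits above the median.
print("ratios:", statistics.mean(ratios), statistics.median(ratios))
# Roughly symmetric on the log scale: mean and median nearly coincide.
print("logs:  ", statistics.mean(logs), statistics.median(logs))
```

Over short horizons and with small daily changes the skew is barely visible, which is one reason the shape is easily mistaken for a normal distribution.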

One of the foundations of statistics is the fact that you can add variances. Maybe this surprises you a little, because the formula for the variance does not look like that at first glance. This article will show you the proof of why and under which circumstances adding variances is valid. Please check my articles on addition and multiplication of expected values first if you have not worked with them yet.
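The "under which circumstances" part can be previewed with an exact enumeration. The sketch below assumes two fair dice as an example of my own choosing: when the dice are independent the variances add, and when the same die is simply counted twice they do not.

```python
from fractions import Fraction
from itertools import product


def variance(outcomes):
    """Population variance of equally likely outcomes."""
    outcomes = list(outcomes)
    n = len(outcomes)
    mean = Fraction(sum(outcomes), n)
    return sum((x - mean) ** 2 for x in outcomes) / n


die = range(1, 7)

var_single = variance(die)                                            # Var(X)
var_independent_sum = variance(x + y for x, y in product(die, die))   # Var(X + Y)
var_dependent_sum = variance(x + x for x in die)                      # Var(X + X)

print(var_single, var_independent_sum, var_dependent_sum)  # prints: 35/12 35/6 35/3
```

For independent dice, 35/12 + 35/12 = 35/6 as expected. For the fully dependent case the result is 35/3, twice as large, because the covariance term no longer vanishes.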

Maybe you have already had to multiply means or expected values. If you know, for instance, how often people go shopping on average and how much money they spend per shopping tour on average, then you could multiply both to obtain the average amount spent. In this post I will explain why multiplying means and expected values is a valid operation.
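A small exact check shows when the product rule works. The sketch below assumes two fair dice (my example, not the shopping data) and relies on independence, a condition the teaser does not spell out: for independent dice the mean of the product equals the product of the means, while for a die multiplied by itself it does not.

```python
from fractions import Fraction
from itertools import product

die = range(1, 7)


def mean(values):
    """Arithmetic mean of equally likely values, as an exact fraction."""
    values = list(values)
    return Fraction(sum(values), len(values))


product_of_means = mean(die) * mean(die)                                  # E(X) * E(Y)
mean_of_independent_product = mean(x * y for x, y in product(die, die))   # E(X * Y)
mean_of_dependent_product = mean(x * x for x in die)                      # E(X * X)

print(product_of_means, mean_of_independent_product, mean_of_dependent_product)
# prints: 49/4 49/4 91/6
```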

In applied statistics you often have to combine data from different samples or distributions. One of the most frequently used operations here is adding means and expected values. For instance, you could sample people’s leg length, torso height and head height. The result when adding these means? It is the average body height, I hope!