Statistical Analysis Part I: Standard Deviation and Correlation

Submitted By Aaron NematNejad

In this article we examine some of the main statistical tools used in trading securities. We use examples from trading long short equity pairs and analyze statistical methods to determine the quality of the trade. The statistical concepts covered include, averages, correlation, standard deviation, mean reversion.

Although statistical analysis could be used to analyze single stocks we will mainly apply the principles in long short equity pair trades. Reasons being that correlation can only be applied when describing the behavior of more than one asset. All the examples given are by being long Superior Energy (SPN) and short Cameron International (CAM). The ratio of two stocks is simply the price of the long stock divided by the price of the short position.  A fundamental analysis is made to buy SPN and sell CAM using the methods described in the section ¡°Advanced Equity Trading¡±. However before a trade is carried out we will need to check to see if the two stocks correlate with each other and if there is a significant mean reversion return. We will start of by explaining the main statistical tools.

Click Here to Download the Spreadsheet

Mean and Standard Deviation
The mean of a variable is the simple average of the variable. It is usually denoted with a bar on top of the variable being calculated. Its is calculated by summing the values of the variables, in our case the price ratio between the two stocks and then dividing the summation by the number of data points. This is expressed as

Mean

 

[1]

Mean = ratio1 + ratio 2 + ratio 3 + ………ratio n

                       ------------------------------------------------------           

                                    Number of observations

On 7/31/07 the ratio was 0.5169. If we download 1 year worth of data we will get 252 business days worth of prices. The average will be 0.5763. Many traders will look at a continues moving average figure instead a single static number. This is simply a rolling daily average going back the number of days you want to calculate the average for. For example a 30 day moving average is simply a chart or data figure representing what the 30 day average is at each point in time.

Standard deviation is measure that identifies the spread in a set of values. It is usually denoted by a lower case sigma (¥ò). To calculate the standard deviation, the mean needs to be calculated first using [1]. Followed by this a table needs to be created to calculate the resulting value when subtracting the mean from each price ratio for the sample. The square of the difference is taken to remove negative figures and the resulting value is divided by the number of observations. This number is the variance. The square root of the variance is the standard deviation. We will replace the stock price ratio by x, and the number of observations by N, to be consistent with mainstream mathematical notations.

[2]

If there is a large spread in values, the standard deviation of the sample will be large and vice versa. If there is no change in the variables, the sample will have 0 standard deviations.

Covariance and Correlation
In our case we need to apply the principles of covariance and correlation by looking at the two stocks separately and not as a price ratio. Here we are trying to determine the relationship between two random prices.
 
Covariance is a measure of how much two variables move together. This is different to variance which measures how much a single variable moves. If two variables move together the covariance is positive. If they move in opposite directions the covariance is negative. The formula is expressed as

[3]

In our case the function that we want to calculate is Cov (SPN,CAM).
Like standard deviation, when calculating covariance the range of values is infinite. Correlation however ranges between -100% to +100% which makes it a better measure. The relationship between correlation and covariance is
simply represented by the following formula:

[4]


Substituting equation [3] into [4] we arrive at

[5]

Correlation is an extremely important tool in trading for two reasons. If a trader wants to hedge a position he has to have confidence that the two stocks generally trade in the same direction so as to protect the value of the portfolio if the markets take the wrong turn.

Examples in Excel

The attached Excel spreadsheet will allow one to see how the principles and math discussed above can be applied in practice. The aim of the model in the spreadsheet is to evaluate if the long short pair idea is profitable. Note that the columns highlighted in yellow mean are columns where the numbers are calculated by using excel functions rather than manual calculations. For example columns H and I both calculate the standard deviations, however column H is calculated by using simple Excel additions and subtractions where as column I uses =STDEVA().

We start of by downloading the price history of the Superior Energy (SPN) and Cameron International Corp (CAM) in columns C and D. Column E is the price ratio which is simply column C divided by D. Column F is the mean using formula [1]. I included 2 columns that calculate the standard deviations. The first one in column H is calculated manually using equation [2] and column I calculates it using the Excel function =STDEV(¡¦). As we can see the two numbers back each other up. Columns J to N are similar except they analyze CAM instead of SPN.

Columns M to U calculate the statistical data using a combination of SPN and SAM, whether it is the correlation or price ratio. Column N contains the manual calculation for covariance using equation 2. Column O contains the same calculation but using the excel function =COVAR(). The two resulting numbers are the same.  Column P and Q are the correlations of the two stock, P is the manual calculation and Q uses the Excel function =CORREL(..).

Bollinger Bands
Bollinger Bands are plotted on a chart to indicate a certain number of standard deviations above and below the moving average. To plot Bollinger bands one needs to calculate the standard deviation of the stock price ratio. To plot the upper Bollinger chart one has to multiply the standard deviation by 2 and add that to the average. Similar method, for the lower Bollinger band where one subtracts two standard deviations from the average. These are represented in columns T and U.

August,6, 2007



Did you like this article?

These days there are so many

These days there are so many investment tool software that can be effectively used to make investments and do trading. The thing is that since these tools are relatively new people are not sure whether these investment tools are beneficial and how to use them. If you are amongst those then there are many investor camps and seminars where you can gain more information regarding how to use and benefit from these new age auto software investment tools. Keith, Investools Scam

Statistical Analysis Part I:

Statistical Analysis Part I: Standard Deviation and Correlation :It is great invention.I like too mush. Timeshare Resales

buy prosolution pills,

buy prosolution pills, prosolution pills are the best. order prosolution pills today. prosolution pills prosolution pills prosolution pills, prosolution pills, prosolution pills great site prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills prosolution pills

Luckily for us, ed hardy

Luckily for us, ed hardy clothing is one of the few labels that never go out of fashion. ed hardy Selling ed hardy men longsleeve,ed hardy swimsuit,ed hardy men swimsuit,ed hardy boot. ed hardy clothes ed hardy Newest ed hardy,ed hardy clothing,Shirts,Swimwear Commodity New styles have just arrived.Cheapest EDHardy Sale. ed hardy store ed hardy shoes best Ed Hardy UK Online Store. We provide customer with Large of high quality and low price Ed Hardy Clothing & better .

Such interesting read and

Such interesting read and information, thanks for sharing this post, I’ve already bookmarked your blog. I can see that you are putting a lot of time and effort into your blog and detailed articles! I am deeply in love with every single piece of information you post here. Will be back often to read more updates. Research Papers - How to write an essay

great article. I remember my

great article. I remember my course of Math from Uni. Do you think these appoaches are actual today in our not quiet economy?
-----------
1 line slots | 5 reel slots

Pretty good post. I just

Pretty good post. I just stumbled upon your blog and wanted to say,It would be great if they also implemented this in schools it will be a great interesting post..Regards, Lawyers Turner

this is interesting story.

this is interesting story. It will be helpful for the student. Sarong

Great post! free games

Would also be better if the

Would also be better if the data for the 60 day mean was extended to 12 months to make a better comparison with the second year mean data. Luton van

These measures are very

These measures are very useful in finance, no doubt about it. And they can be used in other fields, as long as the data is a regular time series.
PokerStars marketing code

Its a nice article. I like

Its a nice article. I like this article. Thank you.

Credit Card Applications

Helpful blog. The

Helpful blog. The statistical concepts covered include, averages, correlation, standard deviation, mean reversion. Travel Insurance Comparison

I like this article very

I like this article very much ,You Have written a great work sheet Thank Compare Credit Cards

This site is very useful for

This site is very useful for the students who are intrested in mathes.The techniques here to solve the problems is really good. Personal Injury Claim

This formula should track

This formula should track the Excel function relatively closely as you're calculating a rolling value as opposed to a spot value. Elizabeth Bay Property Management

Can anyone give me review of

Can anyone give me review of application of any statistical tool? Thanks, auto insurance quotes

This is interesting story.

This is interesting story. It will be helpful for the student. Car Insurance Comparison

The techniques here to solve

The techniques here to solve the problems is really good.This site is very useful for the students who are intrested in mathes. Regards, jack - Buy colon cleanse

Its an nice informative

Its an nice informative information in it...I really liked the blog very much ...I enjoyed reading it ...Thanks, roy - anti-aging

great policy and the best

great policy and the best works ever, thanks for sharing this post with us. regards, tellasr free ads |employment |dual flush toilet

We have provided with very

We have provided with very good standards in this Reflective Essay. And the columns highlighted in yellow mean are columns where the numbers are calculated by using excel functions as mentioned in the Research Paper Topics rather than manual calculations. Regards, Gathrin - the author of Term Papers U.S.A, california.

heyy the formula so that all

heyy the formula so that all price values are subtracting the most recent mean value. If you're doing a rolling average, why would you subtract the latest mean? If anything, you'd be subtracting the mean of the whole range or at least the period n. Regards, bret - High Chair

The latest cel backs up the

The latest cel backs up the function in column I and O, but if I fill the handwritten formula down, the handmade formula departs controlled flight very quickly. This formula should track the Excel function relatively closely as you're calculating a rolling value as opposed to a spot value. club penguin cheats

Can anyone give me review of

Can anyone give me review of application of any statistical tool? Thanks, income protection

what are the statistical

what are the statistical tools & techniques to collect the data? Thanks, SEO

Great site nice to see . It

Great site nice to see . It would be great if they also implemented this in schools it will be a great interesting post..Regards, steve - Acne Skin care

heyy thanku for this site

heyy thanku for this site for providing great stuff..its really useful for students. Regards, pitt - free online games

This is a really interesting

This is a really interesting story. Keep it up, my colleagues would love this. Ways to make Money Online Get Paid to Take Surveys Make Money Online

Calculation of bias further

Calculation of bias further accelerates FCS data acquisition both by allowing a determination of the minimum data acquisition time consistent with a specified bias level, and by enabling the correction of correlation functions obtained using brief data collection times that produce bias.we have derived analytical equations for calculating the variance and the bias for FCS autocorrelation functions and have validated them by comparison with experimental measurements. Regards, jacki - mahjong

Having a net income figure

Having a net income figure is great but it would be a lot more interesting how much income each share generates. Custom Essays. Regards,

I am from South Africa and

I am from South Africa and doing learnership in wealth management. I therefor am doing research on many things and would like to be assisted when asking for help. green tea. Regards,

The developer of Bollinger

The developer of Bollinger Bands, has created this website to provide information about his various investment services.

--------
online casinos | Best Mobile Casinos

The standard deviations of

The standard deviations of all the points on the correlation function are ... Koppel provided the first statistical analysis of the standard deviations for ... In the first part of this paper we have developed a theoretical framework// --------------------------- african american stereotypes l penis advantage review

The calculation of the

The calculation of the variance at each point of the correlation function eliminates the need for repetitive measurements to obtain this information and thereby facilitates determining optimal rate parameters by fitting measured to theoretical correlation functions. Calculation of bias further accelerates FCS data acquisition both by allowing a determination of the minimum data acquisition time consistent with a specified bias level, and by enabling the correction of correlation functions obtained using brief data collection times that produce bias. ( accounting degree | education degree )

we have derived analytical

we have derived analytical equations for calculating the variance and the bias for FCS autocorrelation functions and have validated them by comparison with experimental measurements. The most important consequence of this work is that using these equations one can substantially shorten the time required for acquisition of FCS data. ( online diploma | accredited degree | life experience degree )

You Have wriiten a great

You Have wriiten a great work sheet , I like this article very much , Thank you---------------------------------------------Bellevue chiropractor || des moines mortgage

One more

The last thing is that your columns H and N don't track very well against the Excel function. Sure, the latest cel backs up the function in column I and O, but if I fill the handwritten formula down, the handmade formula departs controlled flight very quickly. This formula should track the Excel function relatively closely as you're calculating a rolling value as opposed to a spot value.

Another question

In your worksheet, column M, the calculation of x minus x-bar times y minus y-bar has another anomaly. You've locked the formula so that all price values are subtracting the most recent mean value. If you're doing a rolling average, why would you subtract the latest mean? If anything, you'd be subtracting the mean of the whole range or at least the period n. I can't see why the price of SPN and CAM on 5/8/06 would be subtracting the mean of the data FORWARD over a year in July of 07. Can you explain your reasoning behind this?

Anomaly in your worksheet

In your Excel sheet, you calculate the SPN mean based on the n-day mean in cell B2 as your divisor. However, your CAM mean is based on the number of data points with cell B1 as your divisor. One is a 60 day mean, the other a 1 year mean. Shouldn't they both be the same, otherwise you're comparing apples to oranges? And what's the purpose of column H if you're using the STDEVA Excel function to calculate a rolling standard deviation?

Free Course

$5000 Practice Forex Trading Account

First Name:

Last Name:

Email Address:

Phone:

Software and live rates provided by FXCM

Ask An Expert