Introduction to turbulence/Statistical analysis/Multivariate random variables

From CFD-Wiki


Latest revision as of 12:41, 21 June 2007


Joint pdfs and joint moments

Often it is important to consider more than one random variable at a time. For example, in turbulence the three components of the velocity vector are interrelated and must be considered together. In addition to the marginal (or single variable) statistical moments already considered, it is necessary to consider the joint statistical moments.

For example, if u and v are two random variables, three second-order moments can be defined: \left\langle u^{2} \right\rangle , \left\langle v^{2} \right\rangle , and \left\langle uv \right\rangle . The product moment \left\langle uv \right\rangle is called the cross-correlation or cross-covariance. The moments \left\langle u^{2} \right\rangle and \left\langle v^{2} \right\rangle are referred to as the covariances, or simply the variances. Sometimes \left\langle uv \right\rangle is also referred to as the correlation.

In a manner similar to that used to build up the probability density function from its measurable counterpart, the histogram, a joint probability density function (or jpdf), B_{uv}, can be built up from the joint histogram. Figure 2.5 illustrates several examples of jpdfs which have different cross-correlations. For convenience the fluctuating variables u' and v' can be defined as

u' = u - U
v' = v - V

where, as before, capital letters are used to represent the mean values. Clearly the fluctuating quantities u' and v' are random variables with zero mean.

A positive value of \left\langle u'v' \right\rangle indicates that u' and v' tend to vary together. A negative value indicates that when one variable is increasing the other tends to be decreasing. A zero value of \left\langle u'v' \right\rangle indicates that there is no correlation between u' and v'. As will be seen below, it does not mean that they are statistically independent.

It is sometimes more convenient to deal with values of the cross-covariances which have been normalized by the appropriate variances. Thus the correlation coefficient is defined as:

    
\rho_{uv}\equiv \frac{ \left\langle  u'v' \right\rangle}{ \left[ \left\langle u'^{2} \right\rangle \left\langle  v'^{2} \right\rangle \right]^{1/2}}

Figure 2.5 not uploaded yet

The correlation coefficient is bounded by plus or minus one, the former representing perfect correlation and the latter perfect anti-correlation.
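
The definition of the correlation coefficient can be tried out on sampled data. The following is a minimal sketch, not part of the original lectures: the function name `correlation_coefficient`, the sample size, and the synthetic Gaussian data are all assumptions made for illustration, and sample averages stand in for the ensemble averages.

```python
import math
import random

def correlation_coefficient(u, v):
    """Sample estimate of rho_uv = <u'v'> / sqrt(<u'^2> <v'^2>)."""
    n = len(u)
    U = sum(u) / n                      # sample mean of u
    V = sum(v) / n                      # sample mean of v
    up = [a - U for a in u]             # fluctuations u' = u - U
    vp = [b - V for b in v]             # fluctuations v' = v - V
    uv = sum(a * b for a, b in zip(up, vp)) / n
    uu = sum(a * a for a in up) / n
    vv = sum(b * b for b in vp) / n
    return uv / math.sqrt(uu * vv)

rng = random.Random(0)                  # fixed seed so the run is repeatable
u = [rng.gauss(0.0, 1.0) for _ in range(20000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
# Hypothetical test signal: v shares 60% of u, so rho_uv should be near 0.6.
v = [0.6 * a + 0.8 * b for a, b in zip(u, noise)]

rho_same = correlation_coefficient(u, u)                  # perfect correlation
rho_anti = correlation_coefficient(u, [-a for a in u])    # perfect anti-correlation
rho_mix = correlation_coefficient(u, v)                   # partial correlation
print(rho_same, rho_anti, rho_mix)
```

The first two values sit at the bounds of plus and minus one, while the third falls strictly between them.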

As with the single-variable pdf, there are certain conditions the joint probability density function must satisfy. If B_{uv}\left( c_{1},c_{2} \right) indicates the jpdf of the random variables u and v, then:


  • Property 1:     
B_{uv}\left( c_{1},c_{2} \right) \geq 0
, always


  • Property 2:     
Prob \left\{ c_{1} < u < c_{1} + dc_{1} , c_{2} < v < c_{2} + dc_{2} \right\} = B_{uv}\left( c_{1},c_{2} \right) dc_{1} dc_{2}


  • Property 3:     
\int^{\infty}_{ - \infty} \int^{\infty}_{ - \infty}  B_{uv}\left( c_{1},c_{2} \right) dc_{1} dc_{2} = 1


  • Property 4:     
\int^{\infty}_{ - \infty} B_{uv}\left( c_{1},c_{2} \right) dc_{2} = B_{u}\left( c_{1} \right)
, where B_{u} is a function of c_{1} only


  • Property 5:     
\int^{\infty}_{ - \infty} B_{uv}\left( c_{1},c_{2} \right) dc_{1} = B_{v}\left( c_{2} \right)
, where B_{v} is a function of c_{2} only


The functions B_{u} and B_{v} are called the marginal probability density functions, and they are simply the single-variable pdfs defined earlier. The subscript indicates which variable is left after the others are integrated out. Note that B_{u}\left( c_{1} \right) is not the same as B_{uv}\left( c_{1},0 \right). The latter is only a slice through the jpdf at c_{2} = 0, while the marginal distribution is obtained by integrating the jpdf over all values of the other variable. Figure 2.6 illustrates these differences.

If the joint probability density function is known, the joint moments of all orders can be determined. Thus the (m,n)-th joint central moment is

    
\left\langle \left( u- U \right)^{m} \left( v - V \right)^n \right\rangle = \int^{\infty}_{-\infty} \int^{\infty}_{-\infty} \left( c_{1} - U \right)^{m} \left( c_{2} - V \right)^{n} B_{uv}\left( c_{1} , c_{2}  \right) dc_{1} dc_{2}
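
As a rough numerical illustration of the normalization and marginal properties and of the joint-moment integral, the sketch below uses a made-up jpdf, B_{uv} = c_{1} + c_{2} on the unit square and zero elsewhere. This jpdf, the grid size, and the helper names are assumptions chosen only because the exact answers are easy to check by hand. The integrals are approximated with the midpoint rule.

```python
def jpdf(c1, c2):
    """Hypothetical example jpdf: c1 + c2 on the unit square, zero elsewhere."""
    if 0.0 <= c1 <= 1.0 and 0.0 <= c2 <= 1.0:
        return c1 + c2
    return 0.0

N = 200
h = 1.0 / N
grid = [(i + 0.5) * h for i in range(N)]   # cell midpoints on [0, 1]

def integrate2(f):
    """Midpoint-rule estimate of the double integral of f * B_uv."""
    return sum(f(a, b) * jpdf(a, b) for a in grid for b in grid) * h * h

total = integrate2(lambda a, b: 1.0)       # normalization: should equal 1
U = integrate2(lambda a, b: a)             # mean of u
V = integrate2(lambda a, b: b)             # mean of v

def joint_moment(m, n):
    """(m, n)-th joint central moment <(u - U)^m (v - V)^n>."""
    return integrate2(lambda a, b: (a - U) ** m * (b - V) ** n)

def marginal_u(c1):
    """Marginal pdf B_u(c1): the jpdf integrated over c2."""
    return sum(jpdf(c1, b) for b in grid) * h

cov_uv = joint_moment(1, 1)                # exact value for this jpdf: -1/144
var_u = joint_moment(2, 0)                 # exact value for this jpdf: 11/144
print(total, U, cov_uv, var_u)
print(marginal_u(0.25), jpdf(0.25, 0.0))   # marginal versus slice at c2 = 0
```

For this example the marginal at c_{1} = 0.25 is 0.75, whereas the slice B_{uv}(0.25, 0) is only 0.25, which is the distinction between a marginal and a slice drawn in the text.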

Figure 2.6 not uploaded yet

In the preceding discussions, only two random variables have been considered. The definitions, however, can easily be generalized to accommodate any number of random variables. In addition, the joint statistics of a single random variable at different times or at different points in space could be considered. This will be done later when stationary and homogeneous random processes are considered.

The bi-variate normal (or Gaussian) distribution

If u and v are jointly normally distributed random variables with means U and V, standard deviations \sigma_{u} and \sigma_{v} respectively, and correlation coefficient \rho_{uv}, then their joint probability density function is given by

    
B_{uvG} \left(c_{1},c_{2} \right) = \frac{1}{2 \pi \sigma_{u} \sigma_{v} \sqrt{1 - \rho_{uv}^{2}} } \exp \left\{ - \frac{1}{2 \left( 1 - \rho_{uv}^{2} \right)} \left[ \frac{ \left( c_{1} - U \right)^{2} }{ \sigma^{2}_{u} } + \frac{ \left( c_{2}-V \right)^{2}}{\sigma^{2}_{v} } - 2 \rho_{uv}\frac{ \left( c_{1} - U \right) \left( c_{2} - V \right) }{\sigma_{u} \sigma_{v}}  \right] \right\}

This distribution is plotted in Figure 2.7 for several values of \rho_{uv}, where u and v are assumed to be identically distributed (i.e., \left\langle u^{2} \right\rangle = \left\langle v^{2} \right\rangle ).

It is straightforward to show (by completing the square and integrating) that this yields the single variable Gaussian distribution for the marginal distributions. It is also possible to write a multivariate Gaussian probability density function for any number of random variables.
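
The statement about the marginals can be checked numerically. The sketch below is only an illustration under stated assumptions: it evaluates the standard bivariate normal density (with the 1 - \rho_{uv}^{2} factors in both the prefactor and the exponent), and the function names, parameter values, and integration grid are all made up for the example. It integrates the jpdf over c_{2} with the midpoint rule and compares the result with the single-variable Gaussian.

```python
import math

def bivariate_gaussian(c1, c2, U, V, sigma_u, sigma_v, rho):
    """Standard bivariate normal jpdf with means U, V and correlation rho."""
    q = 1.0 - rho * rho
    a = (c1 - U) / sigma_u              # standardized first variable
    b = (c2 - V) / sigma_v              # standardized second variable
    norm = 2.0 * math.pi * sigma_u * sigma_v * math.sqrt(q)
    return math.exp(-(a * a - 2.0 * rho * a * b + b * b) / (2.0 * q)) / norm

def gaussian(c, mean, sigma):
    """Single-variable Gaussian pdf."""
    return math.exp(-0.5 * ((c - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical parameter values chosen only for the check.
U, V, sigma_u, sigma_v, rho = 1.0, -0.5, 1.0, 2.0, 0.7

h = 0.05
# Midpoints covering V +/- 20, i.e. ten standard deviations of v either side.
c2_grid = [V + (j + 0.5) * h for j in range(-400, 400)]

c1 = 1.5
marginal_at_c1 = sum(
    bivariate_gaussian(c1, c2, U, V, sigma_u, sigma_v, rho) for c2 in c2_grid
) * h
print(marginal_at_c1, gaussian(c1, U, sigma_u))
```

The two printed numbers should agree closely. Setting `rho` to zero makes the jpdf factor into the product of the two single-variable Gaussians.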

Figure 2.7 not uploaded yet

Statistical independence and lack of correlation

Definition: Statistical Independence. Two random variables are said to be statistically independent if their joint probability density is equal to the product of their marginal probability density functions. That is,

    
B_{uv}\left(c_{1}, c_{2} \right) = B_{u}\left(c_{1} \right) B_{v} \left( c_{2} \right)

It is easy to see that statistical independence implies a complete lack of correlation; i.e., \rho_{uv} \equiv 0. From the definition of the cross-correlation:

    
\begin{matrix}
\left\langle \left(u-U \right) \left( v - V \right) \right\rangle & = & \int ^{\infty}_{-\infty} \int ^{\infty}_{-\infty} \left( c_{1} - U \right) \left( c_{2} - V \right) B_{uv} \left( c_{1} , c_{2} \right) dc_{1} dc_{2} \\
 & = & \int ^{\infty}_{-\infty} \int ^{\infty}_{-\infty} \left( c_{1} - U \right) \left( c_{2} - V \right) B_{u}\left(c_{1} \right) B_{v} \left( c_{2} \right) dc_{1} dc_{2} \\
 & = & \int ^{\infty}_{-\infty} \left(c_{1} - U \right) B_{u}\left(c_{1} \right) dc_{1}  \int ^{\infty}_{-\infty} \left( c_{2} - V \right) B_{v} \left( c_{2} \right)  dc_{2} \\
 & = & 0
\end{matrix}

where we have used the equation for B_{uv}\left(c_{1}, c_{2} \right) above, and the final step follows since the first central moments are zero by definition.

It is important to note that the converse is not true: lack of correlation does not imply statistical independence! To see this, consider two identically distributed random variables, u' and v', which have zero means and non-zero correlation \left\langle u'v' \right\rangle . From these two correlated random variables, two other random variables x and y can be formed as

x = u' + v'
y = u' - v'

In general x and y are not statistically independent. They are, however, uncorrelated because:

    
\begin{matrix}
\left\langle xy \right\rangle & = & \left\langle  \left( u'+ v' \right) \left( u' - v' \right) \right\rangle \\
& = & \left\langle u'^{2} \right\rangle + \left\langle u'v' \right\rangle - \left\langle u'v' \right\rangle - \left\langle v'^{2} \right\rangle \\
& = & 0 \\
\end{matrix}

since u' and v' are identically distributed (and as a consequence  \left\langle u'^{2} \right\rangle  = \left\langle v'^{2} \right\rangle ).

Figure 2.8 illustrates the change of variables carried out above. The jpdf resulting from the transformation is symmetric about both axes, thereby eliminating the correlation. The transformation, however, does not ensure that the distribution is separable, i.e., B_{x,y} \left( a_{1},a_{2} \right) = B_{x} \left( a_{1} \right) B_{y} \left( a_{2} \right) , as required for statistical independence.
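
The sum-and-difference construction can also be explored by simulation. The following sketch rests on assumptions that are not in the original text: u' and v' are built from three independent uniform variables, with a shared part `s` supplying the non-zero correlation, and independence is probed by comparing \left\langle x^{2}y^{2} \right\rangle with \left\langle x^{2} \right\rangle \left\langle y^{2} \right\rangle , which would be equal if x and y were independent.

```python
import random

def mean(values):
    """Sample average, standing in for the ensemble average."""
    return sum(values) / len(values)

rng = random.Random(1)                  # fixed seed so the run is repeatable
N = 50000
u_p, v_p = [], []
for _ in range(N):
    a = rng.uniform(-1.0, 1.0)
    b = rng.uniform(-1.0, 1.0)
    s = rng.uniform(-1.0, 1.0)          # shared part makes u' and v' correlated
    u_p.append(a + s)                   # u' and v' are identically distributed
    v_p.append(b + s)

x = [p + q for p, q in zip(u_p, v_p)]   # x = u' + v'
y = [p - q for p, q in zip(u_p, v_p)]   # y = u' - v'

corr_uv = mean([p * q for p, q in zip(u_p, v_p)])   # <u'v'>, about 1/3 here
corr_xy = mean([p * q for p, q in zip(x, y)])       # <xy>, close to zero

x2 = [p * p for p in x]
y2 = [q * q for q in y]
# For independent x and y this difference would vanish.
dependence = mean([p * q for p, q in zip(x2, y2)]) - mean(x2) * mean(y2)
print(corr_uv, corr_xy, dependence)
```

For this particular construction the last quantity comes out clearly non-zero (about -4/15), so x and y are uncorrelated but not independent.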

Figure 2.8 not uploaded yet

Credits

This text was based on "Lectures in Turbulence for the 21st Century" by Professor William K. George, Professor of Turbulence, Chalmers University of Technology, Gothenburg, Sweden.
