nsCouette介绍

The Taylor-Couette (TC) set-up is one of the most famous paradigmatic systems for wall-bounded shear flows in general and maybe the most important one if you are interested in rotating shear flows in particular. For understanding the following chapters of this documentation, it might help to acquaint yourself with a bit of crucial terminology and basic concepts of the TC system as well as with the notation we will use throughout this documentation and in the source code of nsCouette.

  • 代码来源:

John F. Gibson提出Plane Couette算法 http://channelflow.org/dokuwiki/doku.php?id=docs

https://github.com/epfl-ecps/channelflow

NONLINEAR DYNAMICS OF MODE COMPETITION IN ANNULAR FLOWS- 2008-MARC AVILA-phd-thesis

Direct Numerical Simulation of Transition to Turbulence and Turbulence Control in Pipe Flow- Baofang Song-phd-thesis

Governing equation and geometry1

Start from N-S function2

The naiver-Stoke model are very similar to the corresponding equation of the Euler equation. The different is additional terms that take into count the effect of internal friction and heat conduction in the fluid.

The momentum equation in this model is of the form of the Cauchy momentum equation:

$\sigma$ is the stress tensor, g contains all the body forces per unit mass.

Write in terms of equation in cylindrical coordinates:

where $vec{e}$ is a right-handed triad of unit vectors.

Material Derivative

Cauchy stress tensor3

其中,$\sigma$ is the tensor.

one term for example:

The final result, expressed as a tensor, is

Thus,

最后, 可以简化stress term$\nabla \cdot \sigma$ in N-S equation:3

带入上面的公式,

第一项-这里有一大堆整理,参考3

For an incompressible fluid, the term in the square brackets is equal to zero because of continuity equation.

同理,第二项,第三项

Incompressible N-S equation

In may subfields of fluid mechanics, the fluid is often considered to be incompressible, meaning that the density $\rho$ is constant. This assumption satisfied the mass conservation equation to:

Consequently, the bulk viscosity in N-S equation becomes irrelevant, as its terms is zero. For this reason, bulk viscosity is often neglected in fluid mechanics. Still, it relevant in acoustics and high-velocity compressible flow.

同时,2/3那一项也忽略掉。

建立不可压N-S方程, 这里假设$\rho$ is constant

其中,$\vec{u}(\vec{r},t)$ 是速度场, $p^h$ 是hydrodynamic pressure. 雷诺数$Re= \Omega r d/ \nu$, $d= r_0-r_i$, $\eta= r_i / r_o$, the length-to-gap aspect ratio $\Gamma = L_z/d$, $L_z$ 为柱子的高度。

—》由上面的详细推导过程,阐述了 $\nabla \cdot \sigma$的详细推导过程:

因此,公式最终可以整理成不可压的形式为(一些细节和约化参见附录1):

进一步,我们对其作dimensionless处理,通过d约化,$\tau=d^2/\nu$, 和 $\nu^2 / d^2$ 作为空间,时间的计量单位,压力通过$p=p^h / \rho$. 这样上述方程转化为,即为文献1中的公式(3):

  • 备注:后续可进一步简化为轴对称,去掉$\theta$ direction. 不再进行赘述。

Numerical simulation

空间离散

  1. Fourier-Galerkin method: 相当于每个时间步下,都要求解一次方程,算出不同l、n模态或者波数下所对应的r方向的chebshev系数。
    • 备注:这里面也许可以做一些文章。稳定性问题判定,通过eigenvalue。

其中, $kz$ 是最小波数, $k\theta$ 是简化的周期数。

将上面的离散形式带入到最终的推导公式,转化到每个(l,n):

其中,非线性项包含$u\cdot u$和$ u^2$项。

接下来做的是解耦径向和周向速度:

带入:

得到:

  1. finite-difference method in r方向的Chebyshev离散5

Use a standard high-order, central finite-difference method to approximate the radial derivatives.

  1. 时间离散格式-Adams-Bashforth back-ward-difference

    • 计算频域的非线性项系数$\hat{N}^i(u)$, 通过3/2-dealiasing rule:

      • Do matrix-vector multiplication to calculate $\partial_r\hat{u}^i$ (FD method) - chebfun: diff
      • ??Compute dot product in Fourier space to calculate $\partial_\theta \hat{u}^i$ and $\partial_z \hat{u}^i$
      • Perform Fourier transform of $\partial_{r,\theta,z}\hat{u}^i$ and $\hat{u}^i$ to obtain the velocity field and all its derivatives in physical space.
      • Calculate N^i(u)=u^i\cdot \triangle u^i$.
      • Perform inverse Fourier transform to obtain the spectral coefficients $\hat{N}^i(u)$.
    • Obtain the pressure prediction, $\hat{p}*$: solve the Possion equation, with consistent Neumann boundary condition.

    • Obtain the velocity prediction, $\hat{u}*$: solve the three three Helmholtz equations

      with Dirichlet boundary conditions.

    • Correct via an intermediate variable $\phi=2\triangle t (\hat{p}^{i+1}-\hat{p}*)/3$. The incompressibility condition $\nabla\cdot \hat{u}^{i+1}=0$ leads to a Possion equation for $\phi$ with homogeneous Neumann boundary conditions:

    • Compute pressure and velocity correction, $\hat{p}^{i+1}$ and $\hat{u}^{i+1}$:

    • Go back to step 1

    The Navier-Stokes equations are thus advanced in time by solving five systems of linear euqation, of Possion or Helmholtz type, for each Fourier mode.

附录1 :

在柱坐标系下:

令:

因:

得:

最终得到:

Boundary condition

压力项可以用Possion方程进行近似处理:

需要满足一定的边界条件:4

Install

环境:ubuntu 20.04

软件有三个branch,其中nsCouette-gpu 通过GPU计算,支持nivida的cuda-c。

具体操作是,主要需要按照库fftw,hdf5,libblas3, liblapack-dev

1
2
3
4
5
6
7
8
9
10
11
12
$ git clone http://github.com/dfeldmann/nsCouette.git
$ git clone -b nsPipe-1.0 https://github.com/dfeldmann/nsCouette.git
$ git clone -b nsCouette-gpu https://github.com/dfeldmann/nsCouette.git
$ cd nsCouette
$ sudo apt install libblas3 liblapack-dev
$ sudo apt install libhdf5-openmpi-dev
$ sudo apt install libfftw3-dev
# 需要进入ARCH文件夹中,修改make.arch.gcc-linux中hdf5的路径,
# LIBHDF5 = -L/usr/lib/x86_64-linux-gnu/hdf5/openmpi -lhdf5_fortran -lhdf5 -lz //-L表示lib库
# INCLUDES = -I/usr/include/hdf5/openmpi //-I表示include
$ make ARCH=gcc-linux
$ cd gcc-linux && ./nsCouette.x

nsCouette-gpu环境配置(类似C语言):

1
2
3
4
5
6
7
8
9
# 首先得有Nivida的显卡,安装好cuda环境
# 修改Makefile
# all: $(GPU_OBJECTS)
# $(NVCC) -o taylorC $(GPU_OBJECTS) $(LIBS) -L/usr/lib/x86_64-linux-gnu/hdf5/openmpi
#$ (GPU_OBJECTS): %.o: %.cu
# $(NVCC) -c $< -o $@ -I/usr/include/hdf5/openmpi -I/usr/lib/x86_64-linux-gnu/openmpi/include
# 该修改根据自己openmpi和hdf5的安装路径进行选择,如果按照上面库的安装,该路径如上所示。
# 本机只有一个独立显卡,因此还需要修改main.cu 函数的”dev=0“。
$ make clean && make

代码解析

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
// main.cu 代码
#include"TayC.h"

int main( int argc, const char* argv[]){

printf("\n+++++++++++++++++++++++++++");
printf("\nStarting GPU Taylor-Couette");
printf("\n+++++++++++++++++++++++++++");

int dev=0; //根据自己主机的GPU数目来定
double time;

printf("\n");
printf("\nSetting device: %d",dev);
CHECK_CUDART(cudaSetDevice(dev));

//Timing variables
clock_t start, end;
double gpu_time_used;

//Set up
size_p sizes;

sizes.Nr=NR; //R方向的模态数
sizes.Nt=NT; //theta方向的模态数
sizes.Nz=NZ; //z方向的模态数

//Initialize the mesh
double* grid_mesh=(double*)malloc(NR*sizeof(double));

double r_i = 1.0; //ETA/(1.0-ETA);
double r_o = 1.0/ETA; //1.0/(1.0-ETA);


//Chebyshev mesh in r

for(int j=0;j<NR;j++){
//grid_mesh[j]=2.0*(double)j/(double)(NR-1)+1.0;
grid_mesh[j]= (r_i+r_o)/2.0 - cos(PI2/2.0*j/(NR-1))/2.0;

}

//Write mesh

//Set modules
setImplic_7_exp(grid_mesh,sizes);
setDeriv_9_exp(grid_mesh,sizes);
setBoundary(grid_mesh);
setNonlinear(grid_mesh);
setFft(sizes);
setCublas();
setIntegrator(sizes);
setLinear(grid_mesh);
setStatistics(grid_mesh);

//Allocate memory buffers
vfield u, uw, rhs, rhsw;

size_t size_p=NR*NT*NZ*sizeof(double2);

CHECK_CUDART(cudaMalloc(&u.r,size_p));
CHECK_CUDART(cudaMalloc(&u.t,size_p));
CHECK_CUDART(cudaMalloc(&u.z,size_p));

CHECK_CUDART(cudaMalloc(&uw.r,size_p));
CHECK_CUDART(cudaMalloc(&uw.t,size_p));
CHECK_CUDART(cudaMalloc(&uw.z,size_p));

CHECK_CUDART(cudaMalloc(&rhs.r,size_p));
CHECK_CUDART(cudaMalloc(&rhs.t,size_p));
CHECK_CUDART(cudaMalloc(&rhs.z,size_p));

CHECK_CUDART(cudaMalloc(&rhsw.r,size_p));
CHECK_CUDART(cudaMalloc(&rhsw.t,size_p));
CHECK_CUDART(cudaMalloc(&rhsw.z,size_p));


//Start initial field
initField(u, grid_mesh);

//Or read check point
//readCheckpoint(u,grid_mesh,&time,"./checkpoint.h5");

//Time-step
printf("\nParameter");
printf("\nr_o,r_i=%e,%e",r_o,r_i);
printf("\nRe_i,Re_o=%e,%e",REYNOLDS_OUTER,REYNOLDS_INNER);

double U_i=REYNOLDS_OUTER*r_o;
double U_o=REYNOLDS_INNER*r_i;

double t_i=PI2*r_i/NT/U_i;
double t_o=PI2*r_o/NT/U_o;

printf("\nt_i,t_o=%e,%e",t_o,t_i);

double dt=0.1*min(abs(t_i),abs(t_o));

if(VARIABLE_DT){
printf("\nRunning with fixed Courant number=%e",COURANT);
}else{
printf("\nRunning with fixed Dt=%e",dt);
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
// TayC.h 代码块
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <cublas_v2.h>
#include <cufft>
#include <time.h>
#include <sys/time.h>
#include <hdf5.h>
#include <hdf5_hl.h>


#define CHECK_CUDART(x) do { \
cudaError_t res = (x); \
if(res != cudaSuccess) { \
fprintf(stderr, " CUDART: %s = %d (%s) at (%s:%d)\n", #x, res, cudaGetErrorString(res),__FILE__,__LINE__); \
exit(1); \
} \
} while(0)

#define NZ 65 //Number of grid points in real space is 2*(NZ-1)
#define NT 64 //Number of grid points in real space is NT
#define NR 64 //Number of grid points in real space is NR

#define ETA (0.5) //eta=inner_radius/outer_radius, eta<1

#define NZP (3*(NZ-1)/2+1)
#define NTP (3*NT/2)

#define REYNOLDS_INNER (-700.0)
#define REYNOLDS_OUTER ( 700.0)

#define PI2 6.283185307179586
#define LT PI2
#define LZ (PI2/2)

//Number of steps to perform non-linear convolution: save memory
#define NSTEPS_CONV 1

//Size of the influence matrix
#define MAT 8

//Parameter of the implicit time-integrator
#define C_IMPLICIT (0.51)

//Variable dt
#define VARIABLE_DT 1
#define MAXDT 0.01
#define COURANT 0.25
#define TOLERANCE_DTERR 5e-5

typedef struct { double2* r;double2* t;double2* z;} vfield;
typedef struct { int Nr; int Nt; int Nz; double Lt ; double Lz;} size_p;

//Globals
extern double2* AUX_T;
extern double Cfl_r, Cfl_z, Cfl_th;

//Finite differences
void fd_weigths(double x0,double* x,int n,double* L1,double*L2);

//Cublas
void setCublas(void);
void transpose_A(double2* u_2,double2* u_1);
void transpose_B(double2* u_2,double2* u_1);
void transpose_infbackward(double* u_2,double* u_1,int mr);
void transpose_infforward(double* u_2,double* u_1,int mr);
void invert_infmatrix(double* src);

//Derivatives compact 3-3
void setDeriv_3_comp(double* mesh,size_p sizes);
void deriv_R_3C(double2* u,double2* aux);
void deriv_RR_3C(double2* u, double2* aux);

//Derivatives explicit 9
void setDeriv_9_exp(double* mesh,size_p sizes);
void deriv_R_9E(double2* u);
void deriv_RR_9E(double2* u);

//Implicit steps explicit 7
void setImplic_7_exp(double* mesh,size_p sizes);
void implicit_step(double2* u, double2* aux1,double* aux2, double* aux3, double* aux4,
double d_implicit,double dt,int flag);
void calcPressure(double2* p, vfield u, double2* dr,
double2* aux1,double* aux2, double* aux3, double* aux4);
void deriv_R_7E(double2* u);

//IO
void writeBufferBinary(double* w,const char* file,size_t elsize, size_t elements);
void writeBufferBinary(double* w,const char* file,size_t elsize, size_t elements);
void writeBufferBinaryGPU(double* w,const char* file,size_t elsize,size_t elements);
void writeCheckpoint(vfield u, double* grid_mesh,double* time,char* fileName);
void readCheckpoint(vfield u, double* grid_mesh,double* time,char* fileName);
void writeFieldVis(vfield u, double* grid_mesh,double* time,char* fileName);

//fft
void setFft(size_p sizes);
void fftDestroy(void);
void fftForward(double2* buffer);
void fftBackward(double2* buffer);
void fftForward_reduced(double2* buffer);
void fftBackward_reduced(double2* buffer);


//Utils
void copyVfield(vfield u1, vfield u2);
void onesGPU(double2* u1, size_t elements);
void normalize(double2* u, double norm, size_t elements);
void decoupleForward(vfield u);
void decoupleBackward(vfield u);
void copyBuffer(double2* u1, double2* u2);
void updateNonlinear(vfield n1, vfield n2, double d_implicit);
void calcnorm(double2* u_out, double2* u_in);
void setZeromode(double2* u);
void initField(vfield u, double* mesh);
void writeBufferBinaryGPUreal(double* w,const char* file,
size_t elsize,size_t elements);
void zerosGPU(double2* u1, size_t elements);
void diffAbs(double2* u_diff, double2* u);
void fieldAbs(double2* u_diff, double2* u);


//Non-linear
void setNonlinear(double* mesh);
void nonLinear(vfield u, vfield dur,int flag, double time);


//Check routines
void checkDerivatives(void);
void checkNonlinear(void);
void checkImplicit(void);
void checkPressure(void);
void checkLaplacian(void);
void checkTranspose(void);
void checkBoundaries(void);

//Linear
void setLinear(double* mesh);
void pressureProject(vfield rhs);
void implictStep(vfield u, vfield rhs, double d_implicit, double dt);
void calcLaplacian(vfield rhs, vfield u, double d_implicit, double dt);
void calcPressure_hom(double2* u,double2* aux1,double* aux2, double* aux3, double* aux4
,double2 Bci, double2 Bco);
void implicit_step_hom(double2* u, double2* aux1,double* aux2, double* aux3,
double* aux4, double d_implicit,double dt,int flag,double2 Bci, double2 Bco);
void calcDivergence(double2* div, vfield u);
void calcVorticity(vfield vor, vfield u);
void forcing(double2* uz);

//Boundary conditions
void setBoundary(double* mesh);
void calcHsol(double* M, vfield hsolu_i, vfield hsolu_o, vfield hsolp_i, vfield hsolp_o,
double d_implicit, double dt);
void imposeBoundarycond(vfield u, vfield hsolu_i, vfield hsolu_o, vfield hsolp_i,
vfield hsolp_o, double* M);

//Time integrator
void integrate(vfield u, vfield u_w, vfield rhs, vfield rhs_w,
int nsteps, double dt,double* time);
void setIntegrator(size_p sizes);
void check_convergence(int* iter, double dt);
void new_step(double* dt);
void measurecorr(vfield up, vfield um);

//Statistics
void setStatistics(double* mesh);
void readTorq(double2* ut,double2* utdr,double time);

//Non-linear
void padForward(double2* aux,double2* u);
void padBackward(double2* u,double2* aux);
void calcDut(vfield u, vfield du);
void calcDuz(vfield u, vfield du);

//Check cublas
void cublasCheck(cublasStatus_t error, const char* function);

1. A hybrid MPI-OpenMP parallel implementation for pseudo-spectral simulations with application to Taylor-Couette flow
2. The lattice Boltzmann method: Fundamentals and acoustcis
3. https://github.com/demichie/NS_cylindrical/blob/master/NS_.cylindrical.pdf
4. ON PRESSURE BOUNDARY CONDITIONS FOR THE INCOMPRESSIBLE NAVIER-STOKES EQUATIONS*
5. A Practical Guide to Pseudospectral Methods; Cambridge