#### 1. Introduction

This Fortran program runs a general finite difference model of the advection-dispersion equation. It uses coarrays to allow for parallelization.

WARNING: THIS CODE DOES NOT CURRENTLY WORK WITH > 1 IMAGE.

Planning to fix this and do part 3 over the weekend.

#### 2. Equations

\frac{\partial c}{\partial t} = D\frac{\partial^{2} c}{\partial x^{2}} - v\frac{\partial c}{\partial x}

where D is the dispersion coefficient, and v is the average linear flow velocity.

The general equation for finite difference is

\begin{aligned}\frac{c_i(t+\Delta t) - c_i}{\Delta t}&= \theta\left[D\frac{{c}_{i+1}{(t + \Delta t)} -2{c}_i{(t + \Delta t)} + {c}_{i-1}{(t + \Delta t)}}{(\Delta {x})^2}- v\frac{{c}_{i+1}{(t + \Delta t)} - {c}_{i-1}{(t + \Delta t)}}{2\Delta {x}}\right] \\ &\quad + (1-\theta)\left[D\frac{{c}_{i+1}{(t)} -2{c}_i{(t)} + {c}_{i-1}{(t)}}{(\Delta {x})^2}- v\frac{{c}_{i+1}{(t)} - {c}_{i-1}{(t)}}{2\Delta {x}}\right]\end{aligned}

Rearranged,

-\theta Ac_{i-1}(t + \Delta t) + B^\ast c_i(t + \Delta t) - \theta Cc_{i+1}(t + \Delta t) = (1 - \theta)Ac_{i-1}(t) + B^{\ast\ast}c_i(t) + (1 - \theta)Cc_{i+1}(t)

where

\begin{aligned} A &= \frac{D\Delta t}{(\Delta x)^2} + \frac{v\Delta t}{2\Delta x} \\ B^\ast &= 1 + 2\theta\frac{D\Delta t}{(\Delta x)^2} \\ B^{\ast\ast} &= 1 + 2(\theta - 1)\frac{D\Delta t}{(\Delta x)^2} \\ C &= \frac{D\Delta t}{(\Delta x)^2} - \frac{v\Delta t}{2\Delta x}\end{aligned}

Converting to matrices,

\begin{bmatrix} 1& {}& {}& {}& {}& {}& {} \\ -\theta A & B^\ast & -\theta C& {}& {}& {}& {} \\ {} & -\theta A & B^\ast & -\theta C& {}& {}& {} \\ {} & {} & . & . & . & {} & {} \\ {} & {} & {} & -\theta A & B^\ast & -\theta C & {} \\ {} & {} & {} & {} & -\theta A & B^\ast & -\theta C \\ {} & {} & {} & {} & {} & {} & 1\end{bmatrix}\begin{bmatrix} c_0(t + \Delta t) \\ c_1(t + \Delta t) \\ . \\ . \\ . \\ c_\ell(t + \Delta t) \\ c_{\ell+1}(t + \Delta t)\end{bmatrix} = \begin{bmatrix} g_0(t) \\ g_1(t) \\ . \\ . \\ . \\ g_\ell(t) \\ g_{\ell+1}(t)\end{bmatrix}

where g_i(t) = (1 - \theta)Ac_{i-1}(t) + B^{\ast\ast}c_i(t) + (1 - \theta)Cc_{i+1}(t) for 1 \le g \le \ell and g_0(t) and g_{\ell+1}(t) are boundary conditions.

Removing the borders because they are known,

\begin{bmatrix} B^\ast & -\theta C& {}& {}& {} \\ -\theta A & B^\ast & -\theta C& {}& {} \\ {} & . & . & . & {} \\ {} & {} & -\theta A & B^\ast & -\theta C \\ {} & {} & {} & -\theta A & B^\ast\end{bmatrix}\begin{bmatrix} c_1(t + \Delta t) \\ c_2(t + \Delta t) \\ . \\ c_{\ell-1}(t + \Delta t) \\ c_\ell(t + \Delta t)\end{bmatrix} = \begin{bmatrix} g_1(t) + \theta Ac_0(t + \Delta t) \\ g_2(t) \\ . \\ g_{\ell-1}(t) \\ g_\ell(t) + \theta Cc_{\ell+1}(t + \Delta t)\end{bmatrix}

#### 3. Suitability of the Thomas Algorithm

The LAPACK subroutine dgtsv uses Gaussian elimination since the Thomas algorithm is conditionally stable. Gaussian elimination, however, is O(n^3), whereas the Thomas algorithm is O(n).

The Thomas algorithm for this problem is

\begin{aligned} C'_i(t) &= \begin{cases} \frac{-\theta C}{B^\ast} &;~ i = 1 \\ \frac{-\theta C}{B^\ast + \theta AC'_{i-1}(t)} &;~ i = 2,3,\ldots,\ell-1 \end{cases} \\ G'_i(t) &= \begin{cases} \frac{g_1(t) + \theta Ac_0(t + \Delta t)}{B^\ast} &;~ i =1\\ \frac{g_i(t) + \theta AG'_{i-1}}{B^\ast + \theta AC'_{i-1}(t)} &;~ i = 2,3,\ldots,\ell-1 \\ \frac{g_\ell(t) + \theta Cc_{\ell+1}(t + \Delta t) + \theta AG'_{\ell-1}}{B^\ast + \theta AC'_{\ell-1}(t)} &;~ i = \ell \end{cases} \\ c_\ell(t + \Delta t) &= G'_\ell(t) \\ c_i(t + \Delta t) &= G'_i(t) - C'_i(t)c_{i+1}(t + \Delta t) ~~~~;~ i = \ell-1,\ell-2,\ldots,1\end{aligned}

The Thomas algorithm is stable when the matrix is diagonally dominant.

\begin{aligned} B^\ast &\ge \theta(A + C) \\ 1 + 2\theta\frac{D\Delta t}{(\Delta x)^2} &\ge 2\theta\frac{D\Delta t}{(\Delta x)^2} \\ 1 &\ge 0~~ \checkmark\end{aligned}

Thus the Thomas algorithm is stable for this problem.

#### 4. Environment Parameters

The parameters of the environment are (note the addition of t_\text{plume}):

{Variable Declarations 2:2}
real, codimension[*] :: ca, cb, cc
real, codimension[*] :: length, v, d
real, codimension[*] :: t_plume


Added to in sections 5, 6, 8, 9, 10, 11, 13, 2:2 and 2:4

Used in sections 16 and 2:6

where (c(x,t) = c_{x/dx}(t),~ x/dx \in \mathbb Z. dx is declared in the next section.):

VariableUnitsDefinition
camg/Lc_a = c(x,0),~ 0 \le x \le L
cbmg/Lc_b = c(0,t),~ t > 0
ccmg/Lc_c = c(L,t),~ t > 0
lengthmetersL is the length to model.
vm/dayv is the average linear flow velocity.
dm^2/dayD is the dispersion coefficient.
t_plumedayst_\text{plume} is the length of the plume; forever if < 0.

These parameters are read in as two records of three values each:

{Input 2:2}
if (this_image() == 1) then
end if


Added to in sections 5 and 2:2

Used in sections 16 and 2:6

#### 5. Control Parameters

The program also takes control parameters (note the addition of \theta):

{Variable Declarations 2:2} +=
real, codimension[*] :: dx, dt, theta
integer, codimension[*] :: tout_length
! The dimensions of tout_array will be set later, at allocation.
real, dimension(:), codimension[:], allocatable :: tout_array


Added to in sections 6, 8, 9, 10, 11, 13, 2:2 and 2:4

Used in sections 16 and 2:6

where:

VariableUnitsDefinition
dxmetersdx = \Delta x, the spatial discretization.
dtdaysdt = \Delta t, the temporal discretization.
theta\theta is the model's implicitness.
tout_lengthLength of tout_array
tout_arraydaysTimes at which to output

t_\text{max} is the last element of tout_array.

These parameters are read in as three records. The first is three reals, the second is just tout_length (an integer), and the third is tout_length reals:

{Input 2:2} +=
if (this_image() == 1) then
end if
{Allocate Output Times, 6}
if (this_image() == 1) then
end if


Used in sections 16 and 2:6

#### 6. Output Times Allocation

All images have to allocate the array at the same time:

{Allocate Output Times 6}
call co_broadcast(tout_length, source_image=1)
allocate (tout_array(tout_length) [*], stat=alloc_stat, errmsg=alloc_errmsg)
{Handle Allocation Error, 6}


Used in section 5

Handling allocation errors necessitates two more variables:

{Variable Declarations 2:2} +=
integer :: alloc_stat
character(len=80) :: alloc_errmsg


Added to in sections 5, 8, 9, 10, 11, 13, 2:2 and 2:4

Used in sections 16 and 2:6

If an error does occur, output the error message to standard error and exit:

{Handle Allocation Error 6}
if (alloc_stat > 0) then
write (unit=error_unit, fmt="(a)") trim(alloc_errmsg)
stop
end if


Used in sections 8 and 8

To output to standard error, the error_unit number is needed:

{Use Statements 6}
use, intrinsic :: iso_fortran_env, only: error_unit


Used in section 16

This broadcasts all the parameters (except tout_length, which has already been broadcast) from the first image (which reads the values) to all the others:

call co_broadcast(ca, source_image=1)


Used in section 16

#### 8. Splitting the Domain

This section and the following four (through "Simulating to a Particular Time Step" are unchanged from forward FD.

For one image to handle the full length L with discretization \Delta x, it would need an array of size L/\Delta x + 1 to represent the concentrations (including the boundaries). We will assume L and \Delta x are such that their quotient is an integer. To describe the splitting method among n images, I will use an example (L = 14,~ \Delta x = 1,~ n = 3):

|BB   image 1   BB|     |BB    image 3     BB|
00 01 02 03 04 05 06 07 08 09 10 11 12 13 14
|BB   image 2   BB|


BB indicates that the corresponding element is treated as a boundary condition for that image. We essentially have here three separate models. The only communication between the models is that at each time step, the boundary values of each model are updated to the corresponding "active" values in the image's neighbors. Each image 1 through n - 1 needs an array of size \lfloor L/(n\Delta x) \rfloor + 2, and image n needs an array of size (L/\Delta x + 1) - (n-1)\lfloor L/(n\Delta x) \rfloor. In light of this:

{Variable Declarations 2:2} +=
real, dimension(:), codimension[:], allocatable :: concentrations
real, dimension(:), allocatable :: next_concentrations
integer, codimension[*] :: next_concentrations_length
! Intermediate values
real :: num_edges_real
integer :: num_edges


Added to in sections 5, 6, 9, 10, 11, 13, 2:2 and 2:4

Used in sections 16 and 2:6

Before initializing the domain arrays, the program checks that L/\Delta x is an integer greater than 0. If it is not, it stops:

{Initialize Domain 8}
num_edges_real = length / dx
num_edges = nint(num_edges_real)
if (abs(num_edges_real - num_edges) > 8*epsilon(num_edges_real) .or. &
num_edges < 1) then
write (unit=error_unit, fmt="(a)") "ERROR: L/dx must be an integer >= 1."
stop
end if


Used in section 16

Then the size of the next_concentrations array is calculated (2 less than concentrations, since it excludes the boundaries. Note that the division of positive integers is equivalent to floor of division:

{Initialize Domain 8} +=
if (this_image() == num_images()) then
next_concentrations_length = (num_edges - 1) - &
(num_images()-1)*(num_edges / num_images())
else
next_concentrations_length = (num_edges / num_images())
end if


Used in section 16

Finally, the arrays are allocated and initialized. Note the difference in starting index. This is so that the indexes directly correspond:

{Initialize Domain 8} +=
allocate (concentrations(0:next_concentrations_length + 1) [*], &
stat=alloc_stat, errmsg=alloc_errmsg)
{Handle Allocation Error, 6}
concentrations = ca
allocate (next_concentrations(1:next_concentrations_length), &
stat=alloc_stat, errmsg=alloc_errmsg)
{Handle Allocation Error, 6}


Used in section 16

{Variable Declarations 2:2} +=
integer :: xi


Added to in sections 5, 6, 8, 10, 11, 13, 2:2 and 2:4

Used in sections 16 and 2:6

This just outputs the header for the value table:

if (this_image() == 1) then
call writereal(0.0)
do xi = 0, num_edges
call writereal(xi * dx)
end do
end if


Used in section 16

#### 10. Outputting Reals

This just makes writing reals a bit less wordy. For information on the output format, see
https://tachibanatech.com/litdoc/hw3.full.d/_book/a_heat_conduct.html#1:5

{Variable Declarations 2:2} +=
character(len=13), parameter :: real_fmtstr = "(sp,es16.7e3)"


Added to in sections 5, 6, 8, 9, 11, 13, 2:2 and 2:4

Used in sections 16 and 2:6

{Real Output Subroutine 10}
subroutine writereal(r)
implicit none
real :: r
end subroutine writereal


Used in section 16

#### 11. Main Loop

{Variable Declarations 2:2} +=
integer :: ti, tout_i, image_i


Added to in sections 5, 6, 8, 9, 10, 13, 2:2 and 2:4

Used in sections 16 and 2:6

The main loop basically just "fast-forwards" to each requested output time, then prints a record of the concentrations:

{Main Loop 11}
!write (unit=error_unit, fmt=*) this_image(), next_concentrations_length
ti = 0
do tout_i = 1, size(tout_array)
call run_to(nint(tout_array(tout_i) / dt))
if (this_image() == 1) then
{Print Record, 11}
end if
end do


Used in section 16

This outputs the concentration at each node:

{Print Record 11}
call writereal(ti * dt)
call writereal(concentrations(0)[1])
do image_i = 1, num_images()
do xi = 1, next_concentrations_length[image_i]
call writereal(concentrations(xi)[image_i])
end do
end do
xi = next_concentrations_length[num_images()] + 1
call writereal(concentrations(xi)[num_images()])


#### 12. Simulating to a Particular Time Step

This subroutine just calls time_step and writeback until it has reached a particular time. Note that on entry to time_step, ti should be the index of the time already calculated. time_step calculates the values for time ti + 1.

{Run To Subroutine 12}
subroutine run_to(new_ti)
implicit none
integer, intent(in) :: new_ti

if (new_ti <= ti) then
return
end if
do ti = ti, new_ti - 1
call time_step
call writeback
end do
ti = new_ti
end subroutine run_to


Used in section 16

#### 13. Calculating Coefficients

Note that this is executed before the main loop starts.

{Variable Declarations 2:2} +=
real :: coeff_p, coeff_r
real :: coeff_as, coeff_ass
real :: coeff_bs, coeff_bss
real :: coeff_cs, coeff_css


Added to in sections 5, 6, 8, 9, 10, 11, 2:2 and 2:4

Used in sections 16 and 2:6

These are the coefficients used for calculating time steps ($A^\ast, A^{\ast\ast}, B^\ast, B^{\ast\ast}, C^\ast, C^{\ast\ast}$)

{Calculate Coefficients 13}
coeff_p = (d*dt) / (dx*dx)
coeff_r = (v*dt) / (2*dx)
coeff_as = -theta * (coeff_p + coeff_r)
coeff_ass = (1-theta) * (coeff_p + coeff_r)
coeff_bs = 1 + 2*theta*coeff_p
coeff_bss = 1 + 2*(theta-1)*coeff_p
coeff_cs = -theta * (coeff_p - coeff_r)
coeff_css = (1-theta) * (coeff_p - coeff_r)


Used in section 16

#### 14. Stepping through Time

This subroutine calculates concentrations at time ti + 1. Note that it does not update ti: this is the responsibility of the caller. Neither does it copy the values it puts in next_concentrations back into concentrations. For this copying, use writeback.

The equations for this step are (see "Equations" and "Suitability of the Thomas Algorithm")

\begin{aligned} c_\ell(t + \Delta t) &= G'_\ell(t) \\ c_i(t + \Delta t) &= G'_i(t) - C'_i(t)x_{i+1} ~~~~;~ i = \ell-1,\ell-2,\ldots,1\end{aligned}

{Time Step Subroutine 14}
subroutine time_step
implicit none
integer :: i
real, dimension(1:next_concentrations_length - 1) :: c_prime, g_prime
real :: next_cb, next_cc

! Populate C' and G'
c_prime(1) = coeff_cs / coeff_bs
g_prime(1) = (coeff_ass*concentrations(0) + coeff_bss*concentrations(1) + &
coeff_css*concentrations(2) - coeff_as*next_cb) / coeff_bs
do i = 2, next_concentrations_length - 1
c_prime(i) = coeff_cs / (coeff_bs - coeff_as*c_prime(i-1))
g_prime(i) = (coeff_ass*concentrations(i-1) + &
coeff_bss*concentrations(i)+coeff_css*concentrations(i+1) &
- coeff_as*g_prime(i-1)) / (coeff_bs-coeff_as*c_prime(i-1))
end do

! Back-substitute to solve
i = next_concentrations_length
next_concentrations(i) = (coeff_ass*concentrations(i-1) + &
coeff_bss*concentrations(i) + coeff_css*concentrations(i+1) - &
coeff_cs*next_cc + coeff_as*g_prime(i-1)) / (coeff_bs - &
coeff_as*c_prime(i-1))
do i = (next_concentrations_length - 1), 1, -1
next_concentrations(i) = g_prime(i)-c_prime(i)*next_concentrations(i+1)
end do
end subroutine time_step


Used in section 16

After the plume is over,

if (t_plume > 0 .and. (ti + 1) * dt > t_plume) then
next_cb = 0
next_cc = 0
else
next_cb = cb
next_cc = cc
end if


Used in section 15

#### 15. Write Back Concentrations

This subroutine copies next_concentrations back into concentrations and also syncs the boundaries for each image.

{Writeback Subroutine 15}
subroutine writeback
implicit none
real :: next_cb, next_cc
integer :: i
do i = 1, next_concentrations_length
concentrations(i) = next_concentrations(i)
end do

{Sync Boundaries, 15}
end subroutine writeback


Used in section 16

To sync boundaries, the "left" boundary of each image is set to the last "active" concentration in the previous image (except for on the first image, on which the "left" boundary is a real boundary condition), and the "right" boundary is set to the first "active" concentration in the next image (except for on the lat image, on which the "right" boundary is a real boundary condition).

{Sync Boundaries 15}
sync all
if (this_image() == 1) then
concentrations(0) = next_cb
else
concentrations(0) = concentrations( &
next_concentrations_length[this_image() - 1])[this_image() - 1]
end if
if (this_image() == num_images()) then
concentrations(next_concentrations_length + 1) = next_cc
else
concentrations(next_concentrations_length + 1) = &
concentrations(1)[this_image() + 1]
end if
sync all


#### 16. File Outline

! **WARNING: THIS CODE DOES NOT CURRENTLY WORK WITH > 1 IMAGE.**
! Planning to fix this and do part 3 over the weekend.

{Use Statements, 6}

implicit none
{Variable Declarations, 2:2}

{Input, 2:2}
sync all

{Initialize Domain, 8}

{Calculate Coefficients, 13}

{Main Loop, 11}

contains

{Real Output Subroutine, 10}
{Run To Subroutine, 12}
{Time Step Subroutine, 14}
{Writeback Subroutine, 15}