Mathematical Modeling

Introduction and Applications in Geology

Shaumik Daityari / @ds_mik
http://sdaityari.github.io/mathematical.modeling.slides

What is this all about?

  • Representing a system in terms of the language and concepts of mathematics
  • Statistical Models, Differential Equations, Dynamic Programming, Game Theory
  • Tic-Tac-Toe machine player
  • Weather Predictions

The Process

  • Understand the problem
  • Formulate the problem
  • Design an algorithm
  • Implement the algorithm
  • Run the program - did it solve the problem?

Understand the problem

  • Description of the problem
  • The inputs that would be provided
  • The expected output
  • Describe the problem on paper
  • Solve a few test cases
  • Decide boundary cases

Formulate the problem

  • Translating from mathematical language to a computer language
  • How will you store the input
  • How will you convert the input and arrive at the output
  • How will you display and present the output

Design an algorithm

  • Decide algorithm design (like Divide and Conquer, Greedy)
  • Decide data structures to be used
  • Analyze the efficiency of the algorithm - in terms of time and space
  • If an algorithm is difficult to implement, what approximations do we use?
  • Correctness - do all test cases pass?

Implement the algorithm

  • Translate the algorithm to a program
  • Select and code in a computer language
  • Algorithm may be efficient, but its implementation may be highly inefficient

Run the program - did it solve the problem?

  • Do test cases pass?
  • Are boundary conditions satisfied?
  • Is it as efficient as expected?
  • Does it solve the original problem?

Kinematic Analytics of Viscous Flows through Confined Narrow Zones

Understand the problem

  • In a deformation zone, there are three types of stresses that act on a control volume
  • A lateral pressure
  • Wall perpendicular compression
  • Wall parallel shearing

Boundary Conditions

Formulate the problem

  • Using the Navier Stokes’ equations, we arrive at the final equation
  • Important assumption - the constants are time independent
  • Consider a material point inside the control volume and iterate it over small time periods
  • Trace the movement or record the final position of this material point

Design an algorithm

  • Fairly simple
  • To trace movement of material points, trace the change in positions as you iterate over time.
  • For strain ellipses, create circles and trave the movement of each point on the circle
  • Plot these on a graph representing a cross section of the intial control volume

Implement the algorithm for trace

                            
# Assume an initial point p
p = (x, y)
dt = 1 year
trace = []
for i in range(1000):
    p[0] += u(at x = p[0], y = p[1]) * dt
    p[1] += v(at x = p[0], y = p[1]) * dt
    trace.append(p)
# plot trace
                            
                        

Implement the algorithm for strain ellipse

                            
# For a circle with center (h, k) and radius r
center = (h, k)
circumference = [(h + r * cos(θ), k + r * sin (θ)) for θ in range(360)]
dt = 1 year
strain_ellipse = []
for point in circumference:
    for i in range(1000):
        point[0] += u(at x = point[0], y = point[1]) * dt
        point[1] += v(at x = point[0], y = point[1]) * dt
    strain_ellipse.append(point) # Appending the final state of material point
                            
                        

Run the program - did it solve the problem?

Pattern Recognition in Remote Sensing Images

Understand the problem

  • Images that we receive from satellites are raster images; they consist of pixels
  • Pixel is a picture element, which contains a color value and a brightness value
  • Patterns are a group of similar pixels
  • Essentially clustering of pixels which are similar to each other

Formulate the problem

  • Two ways of clustering points
  • Hierarchical Clustering
  • K-means Clustering
  • Use either of them to cluster pixels

Hierarchical Clustering

  • Input: A set P of points, and a number of clusters k
  • Initialization: Put each point in a cluster by itself
  • Find two closest clusters and merge them into one
  • Repeat until k clusters
  • How do we find the two closest clusters?

K-means Clustering

  • Input: A set P of points, a number of clusters k, and a number of iterations q
  • Initialize k centers
  • Initialize k empty clusters
  • For each point, add it to the cluster closest
  • Recompute the cluster's center
  • Repeat q times

Implement the algorithm

Defining a Cluster class

                            
class Cluster():
    def __init__(self, points, center = None):
        self.points = points
        self.center = ... #Compute the center from the list of points
        # or use initial value provided for empty cluster

    def distance(self, Cluster):
        # compute distance of one cluster from another cluster

    def merge_clusters(new_cluster):
        # merge two clusters and return the merged cluster

                            
                        

Implement the algorithm

Hierarchical Clustering

                            
def hierarchical_clustering(cluster_list, num_clusters):
    """
    Compute a hierarchical clustering of a set of clusters
    Note: the function mutates cluster_list

    Input: List of clusters, number of clusters
    Output: List of clusters whose length is num_clusters
    """
    new_cluster_list = cluster_list[:]

    while len(new_cluster_list) > num_clusters:
        _, node1, node2 = fast_closest_pair(new_cluster_list)
        new_cluster_list[node1].merge_clusters(new_cluster_list[node2])
        del new_cluster_list[node2]

    return new_cluster_list

                            
                        

Implement the algorithm

k-means Clustering

                            
def kmeans_clustering(cluster_list, num_clusters, num_iterations):
    cluster_n = len(cluster_list)

    miu_k = sorted(cluster_list,
              key=lambda c: c.total_population())[-num_clusters:]
    miu_k = [c.copy() for c in miu_k]

    # n: cluster_n
    # q: num_iterations
    for _ in xrange(num_iterations):
        cluster_result = [alg_cluster.Cluster(set([]), 0, 0, 0, 0) for _ in range(num_clusters)]
        # put the node into closet center node

        for jjj in xrange(cluster_n):
            min_num_k = 0
            min_dist_k = float('inf')
            for num_k in xrange(len(miu_k)):
                dist = cluster_list[jjj].distance(miu_k[num_k])
                if dist < min_dist_k:
                    min_dist_k = dist
                    min_num_k = num_k

            cluster_result[min_num_k].merge_clusters(cluster_list[jjj])

        # re-computer its center node
        for kkk in xrange(len(miu_k)):
            miu_k[kkk] = cluster_result[kkk]

    return cluster_result
                            
                        

Run the program - did it solve the problem?

Results of sequential K-Means, where K is two. (a) is the original image. (b) is one of its clusters and (c) is another cluster. Source: Zhenhua, 2010

Final Thoughts

  • Mathematical modeling is an essential tool
  • Algorithms are at the heart of Mathematical Modeling
  • Algorithm design and implementation is very important
  • You can't come up with program that take 10 years to run
  • Different Approaches give different results
  • k-means clustering gives jagged edges as compared to hierarchical clustering, which is smoother
  • Know your algorithms, solve problems with ease

References

  • Nakhleh, Luay, Scott Rixner, Joe Warren, "Algorithmic Thinking", Rice University (Coursera) 2014, https://class.coursera.org/algorithmicthink-001
  • Lv, Zhenhua, et al. "Parallel K-means clustering of remote sensing images based on mapreduce." Web Information Systems and Mining. Springer Berlin Heidelberg, 2010. 162-170.
  • Beaumont, C., et al. "Himalayan tectonics explained by extrusion of a low-viscosity crustal channel coupled to focused surface denudation." Nature 414.6865 (2001): 738-742.
  • Mandal, Nibir, Chandan Chakraborty, and Susanta Kumar Samanta. "Flattening in shear zones under constant volume: a theoretical evaluation." Journal of Structural Geology 23.11 (2001): 1771-1780.

THE END

XKCD Python Comic

Any questions?