  • Research
  • Open Access

A geometric method for contour extraction of Drosophila embryos

BMC Systems Biology 2017, 11(Suppl 6):102

https://doi.org/10.1186/s12918-017-0478-1

  • Published:

Abstract

Background

High-resolution images of Drosophila embryos in their developmental stages contain rich spatial and temporal information on gene expression. Automatic extraction of the contour of an embryo of interest in an embryonic image is a critical step in a computational system used to discover gene-gene interactions in Drosophila.

Results

We propose a geometric method for contour extraction of Drosophila embryos. The key to the proposed geometric method is k-dominant point extraction, which is a generalization of the 3-dominant point extraction proposed in our previous work. Based on k-dominant point extraction, we can approximate a connected component of edge pixels by a polygon that can be either convex or concave. A test on BDGP data shows that the proposed method outperforms two existing methods designed for contour extraction of Drosophila embryos.

Conclusions

The main advantage of the proposed geometric method in the context of contour extraction of Drosophila embryos is its ability to segment embryos that touch each other. The proposed geometric method can also be applied to other tasks relevant to contour extraction.

Keywords

  • Drosophila embryo
  • Contour extraction
  • Concave shape
  • Dominant point
  • Geometric sequence

Background

High-resolution images of Drosophila embryos in their developmental stages contain rich spatial and temporal information on gene expression. They have become a valuable instrument for microbiologists to discover gene-gene interactions [1]. Automatic extraction of the contour of an embryo of interest in an image is a critical step in a computational system for the discovery of gene-gene interactions in Drosophila [2].

In general, Drosophila embryonic images contain a substantial amount of variation [3, 4] in: i) imaging conditions, such as contrast, scale, orientation, and neighboring embryos, ii) gene expression patterns, and iii) developmental stages. Most existing methods were developed upon low-level image features, such as edge pixels or pixels with a high deviation of gray values in a local window [3–10]. Peng and Myers [5] proposed a method that uses the standard deviation of the local window of a pixel to classify the pixel as a foreground or background pixel. Their method applies an 8-neighbor-connectivity region-growing method to extract the contour of an embryo. Pan et al. [6] applied a variant of the Marquardt-Levenberg algorithm to estimate an optimal affine transformation to register localized embryos into an ellipsoidal region. Puniyani et al. [7] proposed an edge-detection-based method that assumes a number of heuristic constraints, including object size, convexity, shape features (e.g., the ratio of the major to the minor axis of an object), and the percentage of overlapping regions. Frise et al. [8] extended the method of Peng and Myers [5] by adding three morphological operations on a binary image: i) removal of isolated pixels, ii) dilation, and iii) majority processing. Furthermore, Frise et al. [8] proposed a heuristic algorithm to separate the embryo of interest from multiple touching embryos, under the assumption that the center of the embryo of interest is the image center. Mace et al. [9] proposed an eigen-embryo method to extract the contour of embryos, where a particle swarm optimizer was used to reduce the computational cost of searching for optimal eigen parameters. Li and Kambhamettu [3] proposed a quadratic curve model to initialize the contour of the embryo of interest based on edge pixels, and applied an active contour model to refine embryo contours. Bessinger et al. [10] proposed criteria to select the optimal connected component of edge pixels in the scale space of an input image. Li [4] proposed algorithms to detect and restore deficiencies and faults of primal sketch tokens that occur when a target object is surrounded by a complex background.

In this paper, we propose a geometric method for contour extraction of Drosophila embryos, the key to which is k-dominant point extraction. In the context of rectangular shape detection [11], we have proposed 3-dominant point extraction, which can be used to analyze the geometric structure of license plates. Note that the contour of a license plate is relatively simple: first, it is piecewise linear; second, it is convex. We propose to generalize 3-dominant point extraction to k-dominant point extraction in order to adapt to the complex geometric structure of the contour of a Drosophila embryo. The complexity of these geometric structures mainly lies in the following two aspects: i) the contour of a Drosophila embryo may be concave and ii) the contour of two Drosophila embryos that touch each other may be concave (see Fig. 1). The proposed method is able to segment embryos that touch each other. Note that many contour extraction methods, including active contour models, cannot segment two objects that touch each other. The proposed geometric method can also be applied to other tasks relevant to contour extraction.
Fig. 1

Two circumstances for the introduction of k-dominant points. a The contour of a Drosophila embryo forms a concave shape, and b the contour of two Drosophila embryos that touch each other forms a concave shape

Methods

3-dominant point extraction

Given a set of points C, such as a connected component of edge pixels, Li et al. [12] proposed a recursive method to extract three dominant points \(v_1\), \(v_2\) and \(v_3\). The first two points (\(v_1\) and \(v_2\)) are the pair of points in C with the maximum Euclidean distance, and the third point \(v_3\) maximizes the sum of distances between a point \(p \in C\) and \(v_i\), \(i=1,2\), i.e.,
$$ \max_{p\in C} \left(\left\| p-v_{1} \right\| + \left\| p - v_{2} \right\|\right). $$
(1)

Given a point \(p\) and a line segment \(v_1v_2\), we call \(\|p-v_1\| + \|p-v_2\|\) a point-line-segment distance, to distinguish it from the common point-line distance, which is the distance between \(p\) and its perpendicular projection onto the line passing through \(v_1\) and \(v_2\).

Based on \(v_1\), \(v_2\), and \(v_3\), the piecewise linearity of C is then verified, i.e., whether each point \(p \in C\) falls on either the line segment \(v_1v_3\) or \(v_2v_3\). If so, the recursion stops. Otherwise, C is partitioned into two subsets, and the above 3-dominant point extraction method is then applied to the two subsets recursively.
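The selection of the three dominant points can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `three_dominant_points` and the brute-force search over all pairs are assumptions made for illustration, and the recursive partitioning step is omitted.

```python
from itertools import combinations
from math import dist

def three_dominant_points(C):
    """Return (v1, v2, v3) for a list C of at least three 2D points."""
    # v1, v2: the pair of points with the largest Euclidean distance
    # (brute force over all pairs, an illustrative choice).
    v1, v2 = max(combinations(C, 2), key=lambda pair: dist(pair[0], pair[1]))
    # v3: the point maximizing ||p - v1|| + ||p - v2||, as in Eq. 1.
    v3 = max(C, key=lambda p: dist(p, v1) + dist(p, v2))
    return v1, v2, v3

# Edge pixels of an L-shaped polyline: (0,0) -> (10,0) -> (10,6)
C = [(x, 0) for x in range(11)] + [(10, y) for y in range(1, 7)]
print(three_dominant_points(C))
```

In this toy example the two end points of the polyline are picked first, and the corner is picked as the third dominant point.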

K-dominant point extraction: basic concepts

In this section, we first generalize the approach for locating the third dominant point, given two dominant points, from a set of unordered points P to a formula for locating the i-th dominant point, given i−1 dominant points, from P. Then, we propose a simple method to insert a new dominant point into a sequence of geometrically ordered dominant points so that the dominant points remain geometrically ordered. Finally, we propose a solution to address the challenge of concave polygons.

The basic idea of k-dominant point extraction is to iteratively insert a new dominant point into a set of k−1 dominant points that have already been found, under certain geometric constraints. For convenience of illustration, we now introduce several basic concepts. First of all, the k−1 dominant points are expected to form a geometric sequence that is consistent with a given set of 2D points.

Let \(\langle r(1),\ldots,r(k)\rangle\) denote a permutation of \(1,\ldots,k\). A geometric sequence of k−1 dominant points is denoted as \(S_{k-1}=\langle v_{r(1)},v_{r(2)},\ldots,v_{r(k-1)}\rangle\). A valid geometric sequence is expected to be a polyline, i.e., \(v_{r(1)}v_{r(2)}\) is the first line segment passing through a subset of points, \(v_{r(2)}v_{r(3)}\) is the second line segment passing through a subset of points, etc. The initial geometric sequence contains two dominant points, ideally representing a line segment (also called a 1-piece polyline).

A closed tag is introduced with respect to a consecutive pair of dominant points \((v_{r(i)},v_{r(i+1)})\) in a geometric sequence, with the motivation of speeding up the insertion of a new dominant point. A consecutive pair \((v_{r(i)},v_{r(i+1)})\) with a closed tag indicates that a new dominant point is not allowed to be inserted between \(v_{r(i)}\) and \(v_{r(i+1)}\) in the associated geometric sequence.

Insertability is introduced with respect to a new dominant point in order to tell whether the new dominant point is allowed to be inserted into a geometric sequence of dominant points. Insertability of a point is essentially introduced as a condition to stop the “global” search for dominant points. Imagine that we have a set of points forming a rectangle. After we find the four dominant points associated with the four vertices of the rectangle, the insertability of the fifth dominant point is expected to be NO in order to avoid inserting a non-vertex point into the sequence. Given a geometric sequence \(S=\langle v_1,v_2,\ldots,v_{k-1}\rangle\) and a point \(v_k\), the point \(v_k\) is called \((S,\epsilon)\)-insertable if the point-line-segment distance from \(v_k\) to every pair \((v_i,v_{i+1})\) is less than or equal to \((1+\epsilon)\) times the length of the line segment \(v_iv_{i+1}\), i.e.,
$$ \left\|v_{k} -v_{i}\right\| + \left\|v_{k} - v_{i+1}\right\| \le (1+\epsilon) \left\|v_{i}-v_{i+1}\right\|, \quad \forall i, $$
(2)

where \(\epsilon\) is a parameter that tolerates the distortion of a straight line. \(\epsilon\) is not a sensitive parameter, and it can be set from 0.01 to 0.05. In this paper, we fix it at 0.02. Thus, we sometimes simply call a point \(v_k\) S-insertable, or just insertable.
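The inequality in Eq. 2 can be evaluated with a few lines of Python. The sketch below is only a literal transcription of the printed inequality, with hypothetical function names (`eq2_holds`, `eq2_holds_for_all_pairs`); it does not model closed tags, and how the outcome feeds into the rest of the procedure should be taken from the definitions in the text rather than from this code.

```python
from math import dist

def eq2_holds(v_k, v_i, v_next, eps=0.02):
    """Evaluate the inequality of Eq. 2 for one consecutive pair (v_i, v_next)."""
    # Left side: point-line-segment distance; right side: (1 + eps) * segment length.
    return dist(v_k, v_i) + dist(v_k, v_next) <= (1 + eps) * dist(v_i, v_next)

def eq2_holds_for_all_pairs(S, v_k, eps=0.02):
    """Evaluate Eq. 2 for every consecutive pair of the sequence S."""
    return all(eq2_holds(v_k, a, b, eps) for a, b in zip(S, S[1:]))

S = [(0.0, 0.0), (10.0, 0.0)]
print(eq2_holds_for_all_pairs(S, (5.0, 0.1)))  # point almost on the segment
print(eq2_holds_for_all_pairs(S, (5.0, 5.0)))  # point far from the segment
```

With ε = 0.02, a point 0.1 units away from the middle of a segment of length 10 satisfies the inequality, while a point 5 units away does not.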

Initialization

Similar to 3-dominant point extraction, k-dominant point extraction starts by locating two points \(v_1\) and \(v_2\) in a given point set P such that their Euclidean distance is the maximum among the distances of all pairs of points in P, i.e.,
$$ \left(v_{1}, v_{2}\right) = \text{argmax}_{p_{1} \in P,\, p_{2} \in P} \left\| p_{1} - p_{2}\right\|. $$
(3)

The closed tags for \(v_1\) and \(v_2\) are both initialized as 0 (i.e., false).

Searching a new dominant point

Given a set of points P and k−1 dominant points \(v_1,\ldots,v_{k-1}\), \(k>2\), we propose the following formula to search for the k-th dominant point:
$$ \max_{p\in P} \sum^{k-1}_{i=1} \left\| p-v_{i} \right\|. $$
(4)

For k=3, we have an initial geometric sequence \(S_2=\langle v_1,v_2\rangle\). Based on Eq. 2, we can test the insertability of the new dominant point \(v_3\). For k>3, we can assume that a geometric sequence \(S_{k-1}\) has been iteratively built, as described in the following section.

It is worth noting that the time complexity of searching for a new dominant point depends on the point set P, i.e., Θ(|P|). However, testing the insertability of the new dominant point depends on the geometric sequence S only, i.e., at a cost of Θ(|S|). With closed tags, the computational cost can be further reduced.
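As a small illustration of Eq. 4, the following Python sketch scans the point set once and returns the point with the largest sum of distances to the dominant points found so far. The function name `next_dominant_point` and the toy square data are assumptions for illustration.

```python
from math import dist

def next_dominant_point(P, dominant):
    """Return the point of P maximizing the sum of distances to the dominant points (Eq. 4)."""
    # One pass over P; for a fixed number of dominant points this is linear in |P|.
    return max(P, key=lambda p: sum(dist(p, v) for v in dominant))

# Boundary pixels of a 10 x 10 square
P = ([(x, 0) for x in range(11)] + [(10, y) for y in range(1, 11)]
     + [(x, 10) for x in range(10)] + [(0, y) for y in range(1, 10)])
print(next_dominant_point(P, [(0, 0), (10, 10), (10, 0)]))
```

Given three vertices of the square as dominant points, the search returns the remaining vertex.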

Upon an insertable dominant point: growing S k−1

Given an S-insertable dominant point, we insert the point into an open “slot” of the geometric sequence S. Given a sequence \(S_{k-1}\) of k−1 dominant points with a geometric order, i.e., \(S_{k-1}=\langle v_{r(1)},v_{r(2)},\ldots,v_{r(k-1)}\rangle\), each open pair \((v_{r(i)},v_{r(i+1)})\) of S offers a slot into which an insertable \(v_k\) can be inserted, as follows:
$$ S_{k-1,i} = \left\langle v_{r(1)}, \ldots, v_{r(i)}, \Box, v_{r(i+1)}, \ldots, v_{r(k-1)} \right\rangle, $$
(5)

where □ indicates a possible slot where \(v_k\) may be inserted. Here the notation \(S_{k-1,i}\) represents an abstract geometric sequence that contains a placeholder □ between the pair \((v_{r(i)},v_{r(i+1)})\). Furthermore, we denote by \(\langle S_{k-1,i},v_k\rangle\) the concrete geometric sequence obtained by replacing the placeholder □ in the abstract sequence \(S_{k-1,i}\) with \(v_k\).

For convenience, we introduce \(r(k)=r(1)\) and augment the sequence \(\langle v_{r(1)},v_{r(2)},\ldots,v_{r(k-1)}\rangle\) to \(\langle v_{r(1)},v_{r(2)},\ldots,v_{r(k-1)},v_{r(k)}\rangle\).

We propose two criteria to measure the confidence of a sequence of dominant points \(S_k=\langle v_{r(1)},v_{r(2)},\ldots,v_{r(k)}\rangle\). The first one is:
$$ conf(S_{k}) = \left| \left\{ p \in C: p \in v_{r(i)}v_{r(i+1)} \right\}\right|, $$
(6)

where \(|\cdot|\) denotes the cardinality of a set.

The second criterion is based on the overall length of the polyline formed by the sequence \(S_k\), i.e.,
$$ conf(S_{k}) = \sum^{k}_{i=1} \left\|v_{r(i)}-v_{r(i+1)}\right\|. $$
(7)

The second criterion can be applied if C forms a simple curve (no self intersection). It is also easy to see that the computational cost of the second type of confidence is much lower than the first type.

By maximizing the confidence of each derived sequence, we can decide the optimal insertion of a new dominant point in order to maintain the geometric order.
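The slot selection can be sketched in Python using the first confidence criterion (Eq. 6). Several choices here are assumptions made for illustration: the helper names, treating a point as lying on a segment when its point-line-segment distance is within a factor (1+ε) of the segment length, closing the polyline by connecting the last vertex to the first, and ignoring closed tags so that every slot is tried.

```python
from math import dist

def on_segment(p, a, b, eps=0.02):
    # Assumption: p counts as lying on segment ab when
    # ||p - a|| + ||p - b|| <= (1 + eps) * ||a - b||.
    return dist(p, a) + dist(p, b) <= (1 + eps) * dist(a, b)

def conf_count(S, C, eps=0.02):
    """First criterion: number of points of C on some segment of the closed polyline S."""
    pairs = list(zip(S, S[1:] + S[:1]))  # wrap around: last vertex connects to the first
    return sum(any(on_segment(p, a, b, eps) for a, b in pairs) for p in C)

def best_insertion(S, v_new, C, eps=0.02):
    """Try v_new in every slot of S and keep the sequence with the highest confidence."""
    candidates = [S[:i + 1] + [v_new] + S[i + 1:] for i in range(len(S))]
    return max(candidates, key=lambda seq: conf_count(seq, C, eps))

# Boundary pixels of a 10 x 10 square and three already ordered vertices
C = ([(x, 0) for x in range(11)] + [(10, y) for y in range(1, 11)]
     + [(x, 10) for x in range(10)] + [(0, y) for y in range(1, 10)])
S = [(0, 0), (10, 0), (10, 10)]
print(best_insertion(S, (0, 10), C))
```

In this example only the insertion after the last vertex yields a polygon whose sides cover all 40 boundary points; the other two slots produce polygons with two diagonals that cover far fewer points.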

Figure 2 illustrates an example of assigning a geometric order to k dominant points. Suppose the first two dominant points \(v_1\) and \(v_2\) are given. Without loss of generality, we start from the geometric sequence \(\langle v_1,v_2\rangle\). After \(v_3\) is computed by Eq. 4, there are two possible options to insert \(v_3\): i) \(\langle v_1,\Box,v_2\rangle\) and ii) \(\langle v_1,v_2,\Box\rangle\). The first option gives the sequence \(\langle v_1,v_3,v_2\rangle\), where the closed tag assignment is (1,1,0), meaning that \(v_1v_3\) and \(v_3v_2\) are closed, and \(v_2v_1\) is open. The second option gives the sequence \(\langle v_1,v_2,v_3\rangle\), where the closed tag assignment is (0,1,1), meaning that \(v_1v_2\) is open, and \(v_2v_3\) and \(v_3v_1\) are both closed. Based on the confidence measure, these two options are both optimal. Without loss of generality, we choose the first option for the following illustration. Subsequently, the proposed method grows the geometric sequence as follows: i) \(\langle v_1,v_3,v_2\rangle\) with closed tags (1,1,0); ii) \(\langle v_1,v_3,v_2,v_4\rangle\) with closed tags (1,1,1,0); iii) \(\langle v_1,v_3,v_2,v_4,v_5\rangle\) with closed tags (1,1,1,0,1).
Fig. 2

Illustration of open and closed tags. Open and closed tags on a consecutive pair of dominant points in a geometric sequence of dominant points

Upon a non-insertable dominant point: reduce P

If a new dominant point \(v_i\) computed by Eq. 4 is non-insertable, we stop growing the geometric sequence and start to reduce the input point set P. The basic idea of reducing P is to remove all points that lie on one of the closed line segments, such that we obtain a subset of points that has a simpler topology. A point remaining in P must be associated with a certain open pair of dominant points. So the above method can be recursively applied to each subset of points, and in turn each output (a sequence of dominant points from a subset of points) can be correctly inserted into a higher-level output. Simply speaking, a non-insertable dominant point activates a divide-and-conquer strategy that can handle the concavity of a data set shaped like a concave polygon.

Figure 3 illustrates the basic idea of how to reduce P when a new dominant point is found to be non-insertable. After reduction, there are two subsets of points since there are two open pairs of dominant points in the geometric sequence. The k-dominant point extraction method is then recursively applied to each of these two point sets.
Fig. 3

Reduction of a point set. Reducing the input point set P if a new dominant point (that is some point in one of four closed line segments) is not insertable. A red dot represents an “old” dominant point, a circle represents a point removed from P, and a black dot represents a point in \(\bar {P}\) (point set after reduction)
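The reduction step can be sketched in Python as a simple filter. This is an illustrative sketch only: the function name `reduce_points` is hypothetical, the closed segments are assumed to be known already, and "lying on a closed segment" is approximated with the point-line-segment distance and the tolerance ε, which is an assumption rather than the paper's closedness test.

```python
from math import dist

def reduce_points(P, closed_segments, eps=0.02):
    """Drop every point of P that lies near one of the closed segments."""
    def near(p, a, b):
        # Assumption: "near" means ||p - a|| + ||p - b|| <= (1 + eps) * ||a - b||.
        return dist(p, a) + dist(p, b) <= (1 + eps) * dist(a, b)
    return [p for p in P if not any(near(p, a, b) for a, b in closed_segments)]

# A straight edge from (0,0) to (10,0) plus a few points bulging away from it
P = [(x, 0) for x in range(11)] + [(4, 3), (5, 4), (6, 3)]
print(reduce_points(P, [((0, 0), (10, 0))]))
```

Only the three points off the closed edge survive the reduction; in the recursive scheme they would form the smaller point set handed to the next level.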

Algorithm 1 summarizes the procedure of testing the closedness of a pair of dominant points (v,w), given a point set P. Note that if closedness is true, P will be updated by removing all points near the line segment vw. The time complexity of the algorithm is dominated by Step 5 (sorting), which is \(O(|P|\log(|P|))\). Algorithm 2 summarizes the recursive implementation of k-dominant point extraction. The time complexity of the algorithm is dominated by two non-recursive steps: i) Step 4 (the initialization of S), which is \(O(|P|^{2})\), and ii) Step 13 (closedness), which is \(O(|P|\log(|P|))\), in addition to the recursive step, i.e., Step 16. In many cases, the concavity is not very complicated and the depth of recursion does not exceed 2. Therefore, the overall time complexity of the algorithm is \(O(|P|^{2}(1+\log(|P|)^{2}))\), under the assumption that the number of vertices of a polygon is a constant and the points in P can be uniformly projected to S.

Limitation of max-sum-distance measure

Figure 4 shows three scenarios of point sets after the first three dominant points have been located. Figure 4a shows a scenario where the fourth dominant point is non-insertable with respect to the sequence \(\langle v_1,v_2,v_3\rangle\), but the point set is non-reducible. More specifically, given a point set (a number of black dots) as illustrated in the figure, assume that the first three dominant points have already been located by the proposed k-dominant point method. Since the max function in Eq. 4 is a convex function, the fourth dominant point is expected to be near the boundary of the convex hull of the point set (which is equivalent to the triangle with vertices \(v_1\), \(v_2\) and \(v_3\)). In other words, the fourth dominant point must be a point nearest to either \(v_1\), \(v_2\) or \(v_3\). Based on the definition of “insertable”, the fourth point is non-insertable with respect to the geometric sequence of \(v_1\), \(v_2\) and \(v_3\). According to the above algorithm, we now stop growing the sequence and start to reduce the given point set P. However, it turns out that P is not reducible. This scenario shows a limitation of using max-sum-distance in computing k-dominant points. Note that this scenario is a representative formulation of two objects that have smooth boundaries and touch each other.
Fig. 4

Different scenarios on dominant points and point sets. a A scenario where the fourth dominant point is non-insertable, and the point set is non-reducible with respect to \(v_i\), \(i=1,\ldots,3\); b A scenario where the fourth dominant point is non-insertable, and the point set is reducible with respect to \(v_i\), \(i=1,\ldots,3\); c A scenario where the fourth dominant point is insertable with respect to \(v_i\), \(i=1,\ldots,3\), and the algorithm will continue to locate the fifth dominant point

Figure 4b shows a scenario where the fourth dominant point is non-insertable and the point set is reducible with respect to \(v_i\), \(i=1,\ldots,3\). The fourth dominant point is expected to be a point on the line segment \(v_2v_3\). Precisely speaking, the fourth dominant point is the point nearest to \(v_2\), as \(\|v_2-v_1\| > \|v_3-v_1\|\). Note that for any point p on the line segment \(v_2v_3\), the sum of distances \(\|p-v_2\| + \|p-v_3\|\) is a constant. If \(\|v_2-v_1\| = \|v_3-v_1\|\), the fourth dominant point can be an arbitrary point on the line segment \(v_2v_3\). Unlike the scenario in Fig. 4a, where all pairs of dominant points are open, the point set in Fig. 4b is reducible because \((v_2,v_3)\) forms a closed pair.

Figure 4c shows a scenario where the fourth dominant point is insertable. Therefore, the algorithm will continue to locate the fifth dominant point. Reducibility is thus not applicable in this scenario.

The rationale for introducing a min-sum-distance measure can also be seen through the Fermat point and its generalization, the geometric median. Given three points in a plane, the Fermat point is the point in the plane that minimizes the sum of distances from itself to the three points. The Fermat point can be computed analytically, as illustrated in Fig. 5. Specifically, we first construct an equilateral triangle (in dashed lines) on each edge of the given triangle (in solid lines), and then connect the outermost vertex of each equilateral triangle to the vertex of the given triangle that lies outside that equilateral triangle by a red line. The Fermat point is the intersection of the three red lines. Given m points in a plane, the geometric median is the point in the plane that minimizes the sum of distances from itself to the m points.
Fig. 5

The Fermat point. Given three points (the black dots) in a plane, the Fermat point (the red dot) is the point in the plane that minimizes the sum of distances from itself to the three points. The Fermat point can be computed analytically

A subtle difference between computing a geometric median and computing a dominant point under the min-sum-distance scheme is that the search space in the former problem is a continuous and infinite plane, while the search space in the latter problem is a discrete and finite point set. For convenience, we call the problem of computing a dominant point under a min-sum-distance scheme the discrete geometric median problem.
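The discrete variant is straightforward to sketch in Python, because the search reduces to a scan over the finite point set. The function name `discrete_geometric_median` and the grid of candidate points are assumptions for illustration.

```python
from math import dist

def discrete_geometric_median(P, anchors):
    """Return the point of P minimizing the sum of distances to the anchor points."""
    return min(P, key=lambda p: sum(dist(p, v) for v in anchors))

# Three anchor points forming an isosceles triangle
anchors = [(0, 0), (10, 0), (5, 8)]
# Finite search space: an integer grid covering the triangle
P = [(x, y) for x in range(11) for y in range(9)]
print(discrete_geometric_median(P, anchors))
```

For this triangle the continuous Fermat point lies on the axis of symmetry x = 5 at a height of roughly 2.9, and the discrete search returns the nearby grid point (5, 3).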

Max/min-sum-distance measure

The three scenarios illustrated in Fig. 4 show that a max-sum-distance scheme alone is not able to handle the various structures of a point set when extracting dominant points. A min-sum-distance scheme therefore needs to be integrated adaptively with the max-sum-distance scheme. In other words, we have to compare both schemes, max-sum-distance and min-sum-distance, and select the better scheme under some criterion to extract the next dominant point while growing the geometric sequence of dominant points. For convenience, we call this the max/min-sum-distance measure problem.

To solve the max/min-sum-distance problem, we propose a balance-oriented criterion to select an optimal measure between the max-sum-distance and min-sum-distance measures as follows: balance the distance from a new dominant point to its two neighboring dominant points in the new geometric sequence. Specifically, given a point set P and a geometric sequence of dominant points \(S_{k-1}=\langle v_1,\ldots,v_{k-1}\rangle\), we have two candidates for a new dominant point: i) \(v^{\max }_{k}\) according to the max-sum-distance measure, and ii) \(v^{\min }_{k}\) according to the min-sum-distance measure. For each candidate, \(v^{\max }_{k}\) or \(v^{\min }_{k}\), we first compute the optimal location to insert by maximizing the confidence of a derived sequence, i.e.,
$$\begin{array}{@{}rcl@{}} i^{*,\max} & = &\text{argmax}_{i} conf\left(\left\langle S_{k-1,i}, v^{\max}_{k} \right\rangle\right) \\ i^{*,\min} & = &\text{argmax}_{i} conf\left(\left\langle S_{k-1,i}, v^{\min}_{k} \right\rangle\right) \\ \end{array} $$
We then compute a distance ratio with respect to \(v^{\max }_{k}\), \(v^{\min }_{k}\) and their neighboring dominant points as follows:
$$\begin{array}{@{}rcl@{}} r^{\max} & = & \frac{\min\left(\left\|v^{\max}_{k} - v_{i^{*,\max}}\right\|, \left\|v^{\max}_{k} - v_{i^{*,\max}+1}\right\|\right)}{\max\left(\left\|v^{\max}_{k} - v_{i^{*,\max}}\right\|, \left\|v^{\max}_{k} - v_{i^{*,\max}+1}\right\|\right)} \\ r^{\min} & = & \frac{\min\left(\left\|v^{\min}_{k} - v_{i^{*,\min}}\right\|, \left\|v^{\min}_{k} - v_{i^{*,\min}+1}\right\|\right)}{\max\left(\left\|v^{\min}_{k} - v_{i^{*,\min}}\right\|, \left\|v^{\min}_{k} - v_{i^{*,\min}+1}\right\|\right)} \\ \end{array} $$

Note that both \(r^{\max}\) and \(r^{\min}\) range from 0 to 1. A larger value indicates a more balanced distance from a candidate to its optimal neighbors. So, if \(r^{\max} \ge r^{\min}\), the candidate \(v^{\max }_{k}\) is selected. Otherwise, \(v^{\min }_{k}\) is selected.
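The distance ratio and the resulting choice between the two candidates can be sketched in Python as follows. The function names are hypothetical, the two neighbors of each candidate are assumed to have been determined already by the confidence-based slot selection, and ties are resolved in favor of the max-sum candidate, which is an assumption.

```python
from math import dist

def balance_ratio(v, left, right):
    """Ratio of the shorter to the longer distance from v to its two neighbors, in [0, 1]."""
    d1, d2 = dist(v, left), dist(v, right)
    # Assumes v differs from at least one neighbor, so the denominator is non-zero.
    return min(d1, d2) / max(d1, d2)

def select_candidate(v_max, nbrs_max, v_min, nbrs_min):
    """Keep the max-sum candidate if it is at least as balanced as the min-sum candidate."""
    r_max = balance_ratio(v_max, *nbrs_max)
    r_min = balance_ratio(v_min, *nbrs_min)
    return v_max if r_max >= r_min else v_min

nbrs = ((0, 0), (10, 0))
print(balance_ratio((9, 1), *nbrs))   # close to one neighbor: small ratio
print(balance_ratio((5, -2), *nbrs))  # equidistant from both neighbors: ratio 1.0
print(select_candidate((9, 1), nbrs, (5, -2), nbrs))
```

Here the second candidate is equidistant from both neighbors and is therefore preferred over the first, which sits right next to one of them.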

It is intuitive that dominant points selected by a balance-oriented criterion give a better description of the global structure of a point set, and thus they can provide a more reliable estimation of geometric properties of a point set, such as curvature. In contrast, if a dominant point has an imbalanced distance ratio to its two neighbors, the estimation of a geometric property at this dominant point will be very sensitive to the localization error of itself and its neighbors.

It is easy to see that the balance-oriented criterion on the distance measure can address the above-mentioned three scenarios very well. It is also not difficult to show that the max/min-sum-distance measure can be consistently selected if a point set forms a convex polygonal shape.

Limitation of max/min-sum-distance measure

Figure 6 shows a scenario where a point set contains an inflection point \(v_{4^{\prime \prime \prime }}\). Since an inflection point is neither a local maximum nor a local minimum, the max/min-sum-distance measure fails to extract such a point as a dominant point at an “early” round. (It is possible that an inflection point can eventually be extracted after a few more rounds.) More specifically, three dominant points \(v_1\), \(v_2\) and \(v_3\) are first extracted without any problem. Based on the max/min-sum-distance measure, there are two candidates for the 4th dominant point: i) \(v_{4^{\prime }}\) according to the max-sum-distance measure, and ii) \(v_{4^{\prime \prime }}\) according to the min-sum-distance measure. However, the inflection point \(v_{4^{\prime \prime \prime }}\) has a more balanced distance ratio to its neighboring dominant points \(v_1\) and \(v_2\) than \(v_{4^{\prime }}\) and \(v_{4^{\prime \prime }}\). Thus, an inflection point can, intuitively, give a better description of the global structure of a point set. That is, a sequence of dominant points that can respond to inflection points can cover a larger number of points in a given point set than an equal-length sequence of dominant points that cannot respond to inflection points.
Fig. 6

A scenario where a point set contains an inflection point \(v_{4^{\prime \prime \prime }}\). \(v_{4^{\prime }}\) and \(v_{4^{\prime \prime }}\) are two candidates for the 4th dominant point based on max/min-sum-distance. However, the inflection point \(v_{4^{\prime \prime \prime }}\) has a more balanced distance ratio to its neighboring dominant points v 1 and v 2 than \(v_{4^{\prime }}\) and \(v_{4^{\prime \prime }}\)

Max/min/median-sum-distance measure

We now generalize the max/min-sum-distance measure to a max/min/median-sum-distance measure. We follow a balance-oriented criterion to select an optimal measure among three measures: i) max-sum-distance, ii) min-sum-distance, and iii) median-sum-distance, as follows: balance the distance from a new dominant point to its two neighboring dominant points in the new geometric sequence. Specifically, given a point set P and a geometric sequence of dominant points \(S_{k-1}=\langle v_1,\ldots,v_{k-1}\rangle\), we have three candidates for a new dominant point: i) \(v^{\max }_{k}\) according to the max-sum-distance measure, ii) \(v^{\min }_{k}\) according to the min-sum-distance measure, and iii) \(v^{\text {mdn}}_{k}\) according to the median-sum-distance measure. For each candidate, \(v^{\max }_{k}\), \(v^{\min }_{k}\) or \(v^{\text {mdn}}_{k}\), we first compute the optimal location to insert by maximizing the confidence of a derived sequence, i.e.,
$$\begin{array}{@{}rcl@{}} i^{*,\max} & = &\text{argmax}_{i} conf\left(\left\langle S_{k-1,i}, v^{\max}_{k} \right\rangle\right) \\ i^{*,\min} & = &\text{argmax}_{i} conf\left(\left\langle S_{k-1,i}, v^{\min}_{k} \right\rangle\right) \\ i^{*,\text{mdn}} & = &\text{argmax}_{i} conf\left(\left\langle S_{k-1,i}, v^{\text{mdn}}_{k} \right\rangle\right) \\ \end{array} $$
We then compute a distance ratio with respect to \(v^{\max }_{k}\), \(v^{\min }_{k}\), \(v^{\text {mdn}}_{k}\), and their neighboring dominant points as follows:
$$\begin{array}{@{}rcl@{}} r^{\max} & = & \frac{\min\left(\left\|v^{\max}_{k} - v_{i^{*,\max}}\right\|, \left\|v^{\max}_{k} - v_{i^{*,\max}+1}\right\|\right)}{\max\left(\left\|v^{\max}_{k} - v_{i^{*,\max}}\right\|, \left\|v^{\max}_{k} - v_{i^{*,\max}+1}\right\|\right)} \\ r^{\min} & = & \frac{\min\left(\left\|v^{\min}_{k} - v_{i^{*,\min}}\right\|, \left\|v^{\min}_{k} - v_{i^{*,\min}+1}\right\|\right)}{\max\left(\left\|v^{\min}_{k} - v_{i^{*,\min}}\right\|, \left\|v^{\min}_{k} - v_{i^{*,\min}+1}\right\|\right)} \\ r^{\text{mdn}} & = & \frac{{\min}\left(\left\|v^{\text{mdn}}_{k} - v_{i^{*,{\text{mdn}}}}\right\|, \left\|v^{\text{mdn}}_{k} - v_{i^{*,{\text{mdn}}}+1}\right\|\right)}{\max\left(\left\|v^{\text{mdn}}_{k} - v_{i^{*,{\text{mdn}}}}\right\|, \left\|v^{\text{mdn}}_{k} - v_{i^{*,{\text{mdn}}}+1}\right\|\right)} \\ \end{array} $$

Note that \(r^{\max}\), \(r^{\min}\) and \(r^{\text{mdn}}\) range from 0 to 1. The largest value indicates the most balanced distance from a candidate to its optimal neighbors. For example, if \(r^{\max}\) is the largest value, the candidate \(v^{\max }_{k}\) will be selected.
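A possible way to generate the three candidates and pick the most balanced one is sketched below in Python. The text does not spell out how the median-sum-distance candidate is computed, so the sketch assumes it is the middle element when the points are ranked by their sum of distances to the current dominant points; it also assumes that existing dominant points are excluded from the search. All names are hypothetical.

```python
from math import dist

def sum_distance_candidates(P, dominant):
    """Return the max-, min- and median-sum-distance candidates from P."""
    # Assumption: points that are already dominant points are not candidates.
    pool = [p for p in P if p not in dominant]
    # Rank the remaining points by their sum of distances to the dominant points.
    ranked = sorted(pool, key=lambda p: sum(dist(p, v) for v in dominant))
    # Assumption: the "median" candidate is the middle element of this ranking.
    return {"max": ranked[-1], "min": ranked[0], "mdn": ranked[len(ranked) // 2]}

def most_balanced(ratios):
    """Return the label ('max', 'min' or 'mdn') with the largest distance ratio."""
    return max(ratios, key=lambda label: ratios[label])

P = [(x, 0) for x in range(6)]
print(sum_distance_candidates(P, [(0, 0), (0, 4)]))
print(most_balanced({"max": 0.2, "min": 0.5, "mdn": 0.9}))
```

The ratios passed to `most_balanced` would be the values \(r^{\max}\), \(r^{\min}\) and \(r^{\text{mdn}}\) computed from each candidate and its neighbors; the numbers used here are made up for the example.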

Sequence subdivision

We propose a sequence subdivision method for the segmentation of touching embryos. The basic idea of our method is the detection of nonsmooth dominant points in a given geometric sequence. To build intuition, let us start from Fig. 7, which compares a scenario of a concave-shape embryo with a scenario of touching embryos. Given a dominant point \(v_i\), let \(\overrightarrow{v_{i-1}v_{i}} = v_{i-1} - v_{i}\) and \(\overrightarrow{v_{i+1}v_{i}} = v_{i+1} - v_{i}\) denote two directional vectors centered at \(v_i\). The cross angle between these two vectors, which can be computed by \(\text{acos}\left(\frac{\overrightarrow{v_{i-1}v_{i}}}{\|\overrightarrow{v_{i-1}v_{i}}\|}\cdot \frac{\overrightarrow{v_{i+1}v_{i}}}{\|\overrightarrow{v_{i+1}v_{i}}\|}\right)\), can be used to tell whether a dominant point is smooth or not. For simplicity, we call the cross angle of the two directional vectors centered at \(v_i\) the angle of \(v_i\). Figure 7a shows a scenario where the contour of a Drosophila embryo is a bean shape that contains two inflection points. The angles of the two inflection points are obtuse, i.e., similar to the cross angles of other dominant points on a smooth contour. Figure 7b shows a scenario where the two hollow dots represent the two intersections between the contours of two touching Drosophila embryos, and their angles are acute.
Fig. 7

A concave-shape embryo vs. touching embryos. a A scenario where the contour of a Drosophila embryo is a bean shape that contains two inflection points. The angles of the two inflection points are obtuse. b A scenario where the two hollow dots represent the two intersections between the contours of two touching Drosophila embryos, and their angles are acute
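The angle of a dominant point can be computed with a short Python sketch. The function names are hypothetical, and the 90-degree threshold used to flag a nonsmooth point is an assumption derived from the acute versus obtuse distinction described above, not a value stated in the paper.

```python
from math import acos, degrees, dist

def angle_at(v_prev, v, v_next):
    """Angle in degrees between the vectors v_prev - v and v_next - v."""
    ax, ay = v_prev[0] - v[0], v_prev[1] - v[1]
    bx, by = v_next[0] - v[0], v_next[1] - v[1]
    # Dot product of the two unit vectors centered at v.
    cos_angle = (ax * bx + ay * by) / (dist(v_prev, v) * dist(v_next, v))
    # Clamp to [-1, 1] to guard against rounding errors before calling acos.
    return degrees(acos(max(-1.0, min(1.0, cos_angle))))

def is_nonsmooth(v_prev, v, v_next):
    # Assumption: a dominant point is flagged as nonsmooth when its angle is acute.
    return angle_at(v_prev, v, v_next) < 90.0

print(angle_at((-1, 2), (0, 0), (1, 2)))        # sharp cusp, as between touching embryos
print(angle_at((-2, 0.5), (0, 0), (2, 0.5)))    # gentle bend, as on a smooth contour
```

The first triple models the sharp notch where two contours meet and yields an acute angle; the second models a shallow bend on a smooth contour and yields an obtuse angle.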

Figure 8 shows the longest subsequence output by Algorithm 3 when applied to the geometric sequence presented in Fig. 1b.
Fig. 8

The longest subsequence output by Algorithm 3. To enhance the intuition, the subsequence is drawn over the corresponding subset of input points that are, however, not involved in Algorithm 3

Results

We tested the proposed method on BDGP (Berkeley Drosophila Genome Project) images [1]. BDGP images are available at the following public webpage: http://www.fruitfly.org/insituimages/insitu_images/.

We first give a visual evaluation of the proposed method. Figure 9 shows a comparison between Li’s method [4] and the proposed method on BDGP images of touching Drosophila embryos. We can observe that Li’s method fails in all three cases, while the proposed method works well. Figure 10 shows more positive results of the proposed method.
Fig. 9

A comparison between Li’s method [4] (the first row) and the proposed method (the second row)

Fig. 10

More positive examples of the proposed algorithm on images of touching Drosophila embryos

Next, we present a quantitative evaluation of the proposed method in terms of detection rates. Given an image, the result is defined as a successful detection if the output of Algorithm 2 overlaps the ground truth by more than 90%. The dataset contains 2000 images of BDGP Drosophila embryos. Table 1 shows the quantitative results, including a comparison with two existing methods for contour extraction of Drosophila embryos: i) Li and Kambhamettu’s method [3], which consists of an initialization based on a quadratic curve model and a refinement based on an active contour model; ii) Li’s method [4], which can detect and restore deficiencies and faults of primal sketch tokens occurring when a target object is surrounded by a complex background.
Table 1

Quantitative results

Method                    Detection rate (%)
Li and Kambhamettu [3]    90
Li [4]                    92
Proposed                  94

Discussion

As we mentioned in the introduction, the proposed framework can be applied to other applications of contour extraction. The main contribution of the proposed framework is k-dominant point extraction based on a specific distance measure. There is a trade-off among the three types of distance measures: max-sum, max/min-sum, and max/min/median-sum. The last one is the most sophisticated, i.e., it can deal with concave contours and touching target objects, while the first one is the most computationally efficient. Therefore, max-sum distance may be used in some circumstances, e.g., when contours are convex. K-dominant point extraction may also be applied to more general applications beyond contour extraction, such as data clustering. In other words, the data to which k-dominant point extraction is applied can have arbitrary dimensions rather than two.

Conclusions

We have proposed a geometric method for contour extraction of Drosophila embryos. Experimental results show the effectiveness of the proposed method, particularly in segmenting two touching embryos in an image. The results also show that the proposed method outperforms two previous methods. The proposed method advances the theory of control point detection by generalizing 3-dominant points to k-dominant points. Because the generalization includes strategies for handling concave shapes, the proposed method can be applied to a wide range of applications relevant to contour extraction.

Declarations

Acknowledgements

We thank the editors and reviewers for their constructive comments and suggestions, which improved the quality of the paper.

Funding

This work was supported by National Science Foundation Grant of China 61370160, Guangdong Province Natural Science Foundation Project 2015A030313578, and Guangdong Scientific and Technological Plan Project 2015B010106005. Publication costs were funded by National Science Foundation Grant of China 61370160.

Availability of data and materials

BDGP data are available on the public website: http://www.fruitfly.org/insituimages/insitu_images/.

About this supplement

This article has been published as part of BMC Systems Biology Volume 11 Supplement 6, 2017: Selected articles from the IEEE BIBM International Conference on Bioinformatics & Biomedicine (BIBM) 2016: systems biology. The full contents of the supplement are available online at https://bmcsystbiol.biomedcentral.com/articles/supplements/volume-11-supplement-6.

Authors’ contributions

QL conceived the study, designed the algorithms, implemented the software, carried out analyses, and wrote the paper. YG carried out analyses and wrote the paper. Both authors read and approved the final manuscript.

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Authors’ Affiliations

(1)
Western Kentucky University, 1906 College Blvd., Bowling Green, 42101, KY, USA
(2)
Cisco School of Informatics, Guangdong University of Foreign Studies, Guangzhou, 510006, People’s Republic of China

References

  1. Tomancak P, et al. Systematic determination of patterns of gene expression during drosophila embryogenesis. Genome Biol. 2002; 3(12):1–14.
  2. Kumar S, Jayaraman K, Panchanathan S, Gurunathan R, Marti-Subirana A, Newfeld SJ. Best: A novel computational approach for comparing gene expression patterns from early stages of drosophila melanogaster development. Genetics. 2002; 16(4):2037–47.
  3. Li Q, Kambhamettu C. Contour extraction of drosophila embryos. IEEE/ACM Trans Comput Biol Bioinforma. 2011; 8(6):1509–21.
  4. Li Q. A primal sketch based framework for bean-shape contour extraction. Neurocomputing. 2014; 142:508–19.
  5. Peng H, Myers EW. Comparing in situ mRNA expression patterns of drosophila embryos. In: RECOMB. Springer: 2004. p. 157–66.
  6. Pan JY, Balan AGR, Xing EP, Traina AJM, Faloutsos C. Automatic mining of fruit fly embryo images. In: KDD. ACM: 2006. p. 693–8.
  7. Puniyani K, Faloutsos C, Xing EP. Spex2: Automated concise extraction of spatial gene expression patterns from fly embryo ish images. Bioinformatics. 2010; 26(12):47–56.
  8. Frise E, Hammonds AS, Celniker SE. Systematic image-driven analysis of the spatial drosophila embryonic expression landscape. Mol Syst Biol. 2010; 6:345. http://msb.embopress.org/content/6/1/345.
  9. Mace DL, Varnado N, Zhang W, Frise E, Ohler U. Extraction and comparison of gene expression patterns from 2d rna in situ hybridization images. Bioinformatics. 2010; 15(26(6)):761–9.
  10. Bessinger Z, Xing G, Li Q. Localization of drosophila embryos using connected components in scale space. In: 19th IEEE International Conference on Image Processing, ICIP 2012, Lake Buena Vista, Orlando, FL, USA, September 30 - October 3, 2012. IEEE: 2012. p. 497–500.
  11. Li Q. A geometric framework for rectangular shape detection. IEEE Trans Image Process. 2014; 23(9):4139–49.
  12. Li Q, Liang G, Gong Y. A geometric framework for stop sign detection. In: IEEE China Summit and International Conference on Signal and Information Processing, ChinaSIP 2015, Chengdu, China, July 12-15, 2015. IEEE: 2015. p. 258–62.

Copyright

© The Author(s) 2017
