Text line Segmentation in Compressed Representation of Handwritten Document using Tunneling Algorithm

In this research work, we perform text-line segmentation directly in the compressed representation of an unconstrained handwritten document image. To this end, we make use of text-line terminal points, the spotting of which is the current state of the art. The terminal points spotted along the left and right margins of a document image for every text line are considered as source and target respectively. The tunneling algorithm uses a single agent (or robot) to identify the coordinate positions in the compressed representation that segment the text lines of the document. The agent starts at a source point and progressively tunnels a path between two adjacent text lines until it reaches the probable target. The agent's navigation path from source to target, bypassing obstacles if any, separates the two adjacent text lines. However, the target point is known only when the agent reaches the destination; this holds for all source points, and hence we can analyze the correspondence between source and target nodes. Artificial intelligence techniques from expert systems, dynamic programming and greedy strategies are employed for every search space while tunneling. Extensive experimentation is carried out on various benchmark datasets including ICDAR13, and the performance is reported.


Introduction
Technological advances in storage and transfer have made it possible to maintain many documents in digital format. It is also necessary to preserve these documents in digital image format, particularly in the case of handwritten documents, for verification and authentication purposes [1,2,3,4,5]. Maintaining these document images in digital form requires huge storage space and network bandwidth. Therefore, an efficient compressed representation is an effective solution to the storage and transfer issues [6,7,8] arising from document images. Generally, document image compression formats follow the guidelines of the CCITT standards [9,10]; CCITT is a part of the ITU (International Telecommunication Union). These standards were specifically targeted at document images stored in digital libraries. On the other hand, if document images, audio and video are to be archived and communicated in the compressed form itself, then it would be considered a third advantage in addition to storage and transmission. Digital libraries that hold document images in their compressed formats could offer a solution to the big data problem arising from document images, particularly with regard to storage and transmission [11]. The compressed image file formats such as TIFF, JPEG, and PNG strictly follow CCITT standards [7,8,9,10,11,12].
For any digital document analysis (DDA), the image in its compressed format must first undergo a decompression stage. However, operating on decompressed documents forfeits the advantages that compression brings in both time and buffer space [6,13,14,15]. If DDA could be achieved in the compressed version of the document image without decompression, then document image compression could be viewed as an effective solution to the immense data problems arising from document images [13]. The idea of operating on and analysing the compressed version of the document image directly, without decompression, is known as Compressed Domain Processing (CDP) [6,13,14,15]. A recent study [12] on CDP presents strategies to perform document image analysis in the compressed representation. However, that model is restricted to printed documents. The challenging job is to perform DDA in the compressed representation of a handwritten document image. Performing DDA even on an uncompressed handwritten document can be difficult because of oscillatory variations, inclined orientation and frequent touching of text lines when writing on un-ruled paper [13]. An initial attempt [13] to spot the separator points at every line terminal in compressed unconstrained handwritten document images using run-length features (Run Length Encoding or RLE) is the state of the art. This motivates us to carry out text-line segmentation in the compressed document image. In this research, we attempt to segment the text lines in the compressed representation with the help of these spotted terminal points. In this context, a study [16] shows a method of extracting run-length features from the compressed image format supported by the CCITT Group 3 and CCITT Group 4 standards. These protocols use run-length encoding, which is widely accepted for binary document image compression.
Our paper aims to segment the text lines of the document image using the run-length data, which is represented in a grid. A detailed explanation of the RLE format of the document image is provided in Section 3. The current state of the art uses a single column (the first column) of the grid to find the separator points at every line terminal along the left margin of the document image. However, the last column of the RLE data does not indicate the depth of the right margin of the document, so it cannot be used directly for spotting separator points. To work on the right margin of the document image, the final non-zero entry of every row in the RLE data is collected to create a virtual column, in which the separator points are spotted. The separator points identified at both margins are then considered as source nodes (separator points along the left margin) and target nodes (separator points along the right margin). In this paper, we make use of a single agent (or robot) for text-line segmentation. The agent's job is to tunnel or trace a path by navigating between two adjacent text-lines, starting from a source point at one end of the document and reaching the destination point at the other end. This process is applied to all the source nodes, resulting in the segmentation of all text-lines. We also analyse the correspondence between the source nodes and the terminal nodes. Further, we address wrong segmentation, which occurs when two or more paths traced by the agent converge to one destination. Corrective measures are taken to tackle this issue by examining the correlation between two adjacent paths. Our approach identifies the text-line positions by operating directly on the RLE representation of the document images, which results in text-line segmentation. The rest of the paper is organized as follows. A survey of related research is presented in Section 2. In Section 3, we explain the RLE representation of a document image and provide the terminologies and corollaries used in this paper. Section 4 describes the algorithmic modelling along with the comparative time complexity with respect to both RLE and uncompressed document versions. Experimental results on benchmark handwritten datasets such as ICDAR 13 and other databases [17] are described in Section 5. Section 6 summarizes the research work with future avenues.

Related Works
In the recent past, a few related works on CDP can be traced, but they are restricted to printed document images. A study [15] on CDP provides a detailed survey of document image analysis techniques from the perspectives of image processing, image compression and compressed domain processing. This enabled various operations in the fields of skew detection and correction, document matching and document archival. A recent article [18] demonstrates straight-line detection in the RLE representation of a handwritten document image. Incremental learning-based text-line segmentation in compressed handwritten document images is reported in [19]. Further, a technical article [13] discusses performing direct operations on the compressed representation of a handwritten document. Its aim is to spot the separator points of every text line at both margins of the document image, enabling text-line segmentation. For better understanding, the identified line terminal points applied over an uncompressed document image are shown in Figure 1. One significant model [20] uses a seam carving approach to segment text-lines in uncompressed images of historical documents. Another recent study [21] discusses various path-finding algorithms, a class of heuristic algorithms based on Artificial Intelligence. These techniques are domain specific and work on uncompressed document images. However, the ideas of path-finding approaches are incorporated in our proposed model, which operates on the compressed version for decision-making in every search space. Further, an effort is made to carry out text-line segmentation directly in compressed handwritten document images to avoid decompression [13], which is the main objective of this paper.

Compressed Image Representation and Corollaries
Modified Huffman (MH) [7,8] is the most common encoding technique under the CCITT compression standards and is supported by all facsimile machines. The improved compression versions are Modified Read (MR) [7,8,9,10] and Modified Modified Read (MMR) [7,8,9,10]. A comparative study of the encoding and decoding techniques of these compression standards is tabulated (Table 1). MH uses RLE for efficient coding, whereas MR and MMR exploit the correlation between successive lines of a document image.
The odd and even columns of the RLE data represent the counts of white-runs and black-runs of an uncompressed document image respectively. A run of consecutive pixels of value '0' (background) is called a white-run [22,23,24,25,26]. Similarly, a run of consecutive pixels of value '1' (foreground) is called a black-run [22,23,24,25,26]. If a row in an image starts directly with foreground, then the first column of the corresponding row in the RLE data holds the entry zero ('0'). This ensures that odd columns and even columns in the grid always contain the counts of white-runs and black-runs respectively. The white-runs represent background, whereas the black-runs represent foreground, that is, the text content.
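The run-length grid described above can be illustrated with a minimal sketch. It assumes a binary image given as lists of 0 (background) and 1 (foreground); the function names are hypothetical, and the sketch encodes raw pixel rows rather than decoding an actual CCITT bit stream.

```python
from typing import List


def rle_encode_row(row: List[int]) -> List[int]:
    """Encode one binary row as alternating run counts, white-run first.

    A row that starts with foreground gets a leading 0, so that odd columns
    (1-based) always hold white-runs and even columns hold black-runs.
    """
    runs: List[int] = []
    current = 0  # the first run always counts background (white) pixels
    count = 0
    for pixel in row:
        if pixel == current:
            count += 1
        else:
            runs.append(count)  # close the run of the previous colour
            current = pixel
            count = 1
    runs.append(count)  # close the final run
    return runs


def rle_encode_image(image: List[List[int]]) -> List[List[int]]:
    """Encode every row of a binary image; the result is the grid of runs."""
    return [rle_encode_row(row) for row in image]
```

For example, the row `[1, 1, 0]` starts with foreground and is therefore encoded as `[0, 2, 1]`.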
Some of the corollaries used in this article are listed here:
a. Each separator point spotted in the first column of the grid (G) is considered a Source Node (S).
b. Each separator point spotted in the virtual column is considered a Target Node (T). The final non-zero entry in every row of G makes up this virtual column.
c. A path is defined as the segmentation of adjacent text lines; an agent (or robot) navigates along the blank space (white-runs) available between two adjacent text-lines from S to T with reference to G.
d. Edges are defined as sequences of white-runs that exist, or are tunneled, between two adjacent text-lines. This sequence of edges forms the path.
e. An obstacle appears because of touching characters between two adjacent text-lines and is a hurdle while tracing the path from S to T. It may also occur due to ascenders and descenders of characters that are longer than a given threshold (t).
f. t is the length covered by an agent searching vertically in both directions (up and down) from its current position, whose two arguments refer to the position in G along the column and row axes respectively; this search space is induced whenever there is an obstacle. t is chosen empirically based on experiments conducted on the datasets.
g. For every S, there is a T. The relationship between S and T is a one-to-one function that strictly follows ascending order. The correspondence between S and T is not defined in the literature [13]. However, in this paper, we find the probable correspondence between S and T with the help of an agent (or robot) and algorithmic strategies.
h. When the relationship between S and T is an onto function or a crossover, it is regarded as wrong segmentation. Even when S and T hold a one-to-one relation, the presence of crossovers is considered wrong segmentation.
i. The distance (d) calculated between S and T denotes the shortest path covering the longest distance (weights). It depends on the number of edges used in the path by an agent to cover the span between S and T.
j. The terminal points S and T do not necessarily possess larger white-runs, because they are heuristically chosen as mid-points of the bands corresponding to the two adjacent text-lines. Therefore, S may not be the actual start point when the weight of a neighboring position is larger than that of S. The actual start point then lies within the search space of t from S.
k. The distance between two intermediate hubs such as h and h' must be equal to or greater than a given threshold (t'). t' is calculated by finding the largest bin observed in the histogram of G with respect to odd columns (white-runs) only.
Some of the assumptions concerning the state space search (tunnel, traversal or path) are given below:
a. S and T are finite.
b. There must be at least one path between S and T.
c. One agent (or robot) is employed at a time for tracing a path.
d. There is no self-loop (a cycle of length one).
e. Tunnel (move to the next state) whenever a next state exists.
Figure 5 shows the terminal points S and T in an uncompressed version for better understanding. It depicts 10 source nodes and 10 target nodes. The one-to-one function for Figure 5 is defined below: Based on the observation, a source may not have a corresponding target. Therefore, an agent starting from such a source may reach T or T', where T' is presumed to be a new target (correspondence). This implies that the agent may reach a point close to T but not exactly the target T, as illustrated in the following section.
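Corollaries a, b and j can be sketched as follows. The virtual column follows the description in corollary b; the band detection is only an assumption for illustration, since the actual separator-spotting procedure is that of [13]. The function names and the `min_depth` parameter are hypothetical.

```python
from typing import List


def virtual_column(grid: List[List[int]]) -> List[int]:
    """Return the final non-zero entry of every row of the RLE grid (0 if none)."""
    column = []
    for runs in grid:
        last = 0
        for value in runs:
            if value != 0:
                last = value  # keep overwriting until the last non-zero entry
        column.append(last)
    return column


def band_midpoints(column: List[int], min_depth: int) -> List[int]:
    """Return the middle row of every band of consecutive rows whose depth
    is at least min_depth (hypothetical criterion for a text-line gap)."""
    midpoints = []
    start = None  # first row of the band currently being scanned
    for row, depth in enumerate(column):
        if depth >= min_depth:
            if start is None:
                start = row
        elif start is not None:
            midpoints.append((start + row - 1) // 2)
            start = None
    if start is not None:  # a band that runs to the last row
        midpoints.append((start + len(column) - 1) // 2)
    return midpoints
```

Under these assumptions, source candidates would come from `band_midpoints([runs[0] for runs in grid], min_depth)` and target candidates from `band_midpoints(virtual_column(grid), min_depth)`.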

Text-line Segmentation
The grid (G) of a document image starts with a column of white-runs. The starting column of G is referred to [13] as the white space that exists at the beginning of a text document. A larger depth (white-runs) in this column indicates the text-line separation gap (band) along the left margin of the document image between the corresponding adjacent text-lines. The mid-points of the constructed bands are chosen as terminal points (source nodes), as stated in corollary j and shown in Figure 5. These points do not necessarily possess the largest white-runs. Therefore, the agent needs to choose the starting point S as the position with the largest white-run among its vertical neighbors within the range t of its current position. Now, assume that the agent's start position S is fixed based on corollary j and that this is the initial search state. Next, the agent reaches a station or hub (h) starting from S, resulting in a new search state. The edge connecting these two stations is the shortest path travelled by the agent to cover the maximum distance while heading towards the probable target. The distance or weight from S to an intermediate hub is computed directly from the position in G, starting with the entry in the first column of G. The search process continues until the agent reaches the corresponding target T or a new target T' (probable destination). The basic idea is that, at every hub h, the agent chooses the next hub that is nearer to T or T' by selecting white-runs from the current position in G, which ensures that a direct edge exists between the two hubs. In other words, the agent selects the largest white-run within the search range to navigate to the next hub closer to T or T'. At every h, a greedy strategy is employed to identify the next successive hub h' by choosing the largest white-run (or largest weight) from the current h, so that the agent reaches T or T' using a minimum number of edges with respect to the hubs visited.
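The greedy choice of the next hub can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' algorithm: a hub is modelled as a (row, reach) pair, where reach is the pixel distance from the left margin; a candidate row is accepted only if the column at the current reach lies in one of its white-runs; and ties are broken in favour of the row nearest to the current one. The names `white_reach` and `next_hub` are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Optional, Tuple


def white_reach(runs: List[int], x: int) -> Optional[int]:
    """Return the end of the white-run covering pixel column x,
    or None if x falls in a black-run or beyond the row."""
    ends = list(accumulate(runs))  # cumulative distance after each run
    i = bisect_right(ends, x)      # index of the run that contains column x
    if i >= len(ends) or i % 2 == 1:  # past the row, or a black-run
        return None
    return ends[i]


def next_hub(grid: List[List[int]], y: int, x: int, t: int) -> Optional[Tuple[int, int]]:
    """Greedy step: among rows y-t .. y+t, pick the one whose white-run
    crossing column x reaches farthest. Returns (row, reach) or None when
    every row in the window is blocked at column x (an obstacle)."""
    best = None
    low, high = max(0, y - t), min(len(grid) - 1, y + t)
    for row in range(low, high + 1):
        reach = white_reach(grid[row], x)
        if reach is None:
            continue
        key = (reach, -abs(row - y))  # farthest reach first, then nearest row
        if best is None or key > best[0]:
            best = (key, row, reach)
    return None if best is None else (best[1], best[2])
```

Calling `next_hub` repeatedly, starting from the source row with `x = 0`, traces one hub after another until the reach equals the row width or an obstacle is met.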
The agent's hopping between hubs from S to T or T' creates a path. The path is formed by the edges connecting the intermediate hubs from S to T or T', and it indicates a progressive segmentation of the two adjacent text-lines in a document image, starting from the left margin and proceeding towards the right. If the total search space for a given path is one, the agent has visited only two hubs, namely S and T; this indicates direct segmentation, without the variations that arise from skewed, curvy and touching lines in the handwritten document image. This situation is sometimes found between two paragraphs separated by a large horizontal space, and may also occur at the top and bottom of documents that have adequate margin space. The whole operation is carried out with reference to G. Suppose the agent reaches an intermediate hub h; we then perform a local search by employing corollary f to reach the next hub (h'). This situation is related to the Hill-Climbing [27] technique in Artificial Intelligence. If the hubs h and h' visited by the agent are located in the positions say and respectively, then the search space (local maxima) is written as.
The selection of each successive hub is based on the distance d, computed by summing the entries of successive columns in G, starting from the first column, for a given row (y coordinate) position. The distance computation is carried out for every row position within the range t on either side of the current position y, as given in the above equation. The job of the agent is to select the largest weight (distance) among those computed cumulative values that cross the current (intermediate) hub. However, measuring the distances for every search space SS ranging from y - t to y + t (both inclusive) is computationally expensive. In computing, memoization (or memoize) [28] is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again. To avoid this re-computation for every SS, we use the memoization technique within the dynamic programming strategy to remember the last computed d of the previous SS, which in turn reduces the computational complexity. The gap or distance between two successive hubs is constrained by a given threshold, which is chosen heuristically based on the experiments conducted on the benchmark datasets. A common issue, mentioned in corollary e, arises when two adjacent text-lines touch one another. In this case, we simply bypass or cross over the lines [29]. Algorithmically, we choose the next successive column (black-runs) with respect to the current position. The same strategy is extended to the case where ascenders and descenders are longer than the given threshold value t, so that the agent is not able to go beyond the search limit. One of the important issues, mentioned in corollary k, arises when the distance between hubs is greater than t'. t' is given below.

t' = maximum(histogram(values in odd columns of G))
where the values in odd columns of G represent white-runs. In this case, we calculate t' by considering the largest bin of the histogram of G with respect to the counts of white-runs (odd columns). If the distance is found to be larger than t', then we apply both the 'don't care condition' [29] and 'minimal cut-sets' [30] concepts from the graph theoretical domain. Therefore, with a minimal number of crossovers, the agent bypasses the obstacle and reaches the next successive hub. Finally, the agent updates its knowledge base about each visited hub with reference to G, covering the distance from S to T or T'. Below we present the proposed algorithm for text-line segmentation.
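As an illustration of the threshold t' only, the following sketch takes one possible reading of the expression above: t' is the white-run length that falls in the most populated histogram bin, with one bin per run length. The bin width, the exclusion of zero-length runs and the tie-break towards the smaller length are all assumptions, and the function name is hypothetical.

```python
from collections import Counter
from typing import List


def threshold_t_prime(grid: List[List[int]]) -> int:
    """Return the most frequent non-zero white-run length in the grid."""
    counts = Counter(
        run
        for runs in grid
        for index, run in enumerate(runs)
        if index % 2 == 0 and run > 0  # 0-based even index = odd column = white-run
    )
    if not counts:
        return 0  # no white-runs at all
    top = max(counts.values())
    # Assumed tie-break: the smallest length among the most frequent ones.
    return min(length for length, count in counts.items() if count == top)
```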

Time Complexity for Text-line Segmentation
The following provides the time complexity estimated for the proposed algorithm to identify the text-line segmentation positions in compressed representation of a document image. The worst-case scenario is calculated as where and represent the number of rows and number of columns of the grid respectively. Usage of memoize technique [28] in dynamic programming strategy has reduced to the complexity from , that is which is a nondeterministic polynomial.
For comparative study, the time complexity is also estimated for a document image in its uncompressed format. For this, the worst-case scenario is where and represent the height and width of an uncompressed image respectively. It is noted that is equal to whereas is lesser than . Time complexity of spotting the line terminal points mentioned in the literature [13] is estimated as in the worst-case scenario. Therefore, the complexity for the proposed algorithm is computed by cascading the two algorithms which result in . Finally, the proposed method takes . The overall computation is tabulated ( Table 2). The efficacy of the proposed algorithm for both compressed and uncompressed versions is shown in Figure 6. The compressed version requires less time, when compared to that of uncompressed images for text line segmentation.

Illustration
In this section, we illustrate the working principle of the Tunneling algorithm using an example. The coordinate positions tracked by the agent (or robot) for text-line segmentation with respect to G are shown in Table 3. For better understanding, an experimental result is shown in the uncompressed version as well. Figure 7 shows the segmentation of two text-lines of a document image. Figure 8 illustrates the text-line segmentation, depicting the transition hubs and the path tunneled by an agent. In the example, the agent starts from the Source Node (S) and reaches the Target Node (T) after visiting six intermediate hubs, Hub 1 to Hub 6. The agent predicts that Hub 1 could be a more effective source node than the initial source [13], as articulated in corollary j. This is because the weight (distance) of Hub 1 is larger than that of the initial source, as illustrated in Table 3. The total distance covered by the agent is 2085 units. The threshold t is chosen as 20 units, so the agent computes the distances for the positions within t on either side of its current position. We employ the memoization technique to reduce the complexity. To this end, we make use of a knowledge base (KB), which is empty initially. Once we compute the distances for both the Source and Hub 1, we feed all the weights (distances) within this range into the KB. The size of the KB is the column size of G. The KB triggers the agent to choose Hub 2 as the successor of Hub 1; Hub 2 covers a distance of 1295 units, which is greater than the 1058 units of the current hub, Hub 1. Next, the agent retrieves the information from the KB to avoid repeating the total computation.

Choosing of Source Node
In this section, we introduce a mechanism to find the start node. Figures 9(a) and 9(b) show a sample image and its compressed representation respectively. Here, the initial source node is not an optimal start point, as mentioned in corollary j. A threshold t is assumed for illustration, which fixes the search range on either side of the initial source node. Where this range extends beyond the grid boundary, it is redefined to start from the boundary, as described in step 2 of the algorithm. Two positions are identified as holding the largest weight within the search range. In this case, we take the position that is nearer to the initial source node. Finally, we choose it as the start node; the distance covered by the agent is then 7 units, which is the largest weight covering the maximum distance, and the agent naturally proceeds towards T or T'.
Fig. 9. (a) A sample image and (b) its RLE format. The green tile is the initial start node and the blue tile is the actual start node. A red line represents the distance from source node to the hub.

Handling Obstacles
Fig. 10. Visualization of a search space tunneling an obstacle. (a) The spatial coordinates of an image and (b) its RLE format. The green tile is the initial start node and the blue tile is the actual start node. Blue tiles represent the intermediate hubs and red tiles represent the obstacle. A red line represents the distance from source node to the hub.
This section describes how obstacles are handled. We encounter two types of obstacles while tunneling a path, as mentioned in corollary e. Generally, an obstacle occurs due to an unconstrained writing style, especially when writing on unruled paper. One possibility is that two adjacent lines touch one another at some positions. The other is that the ascenders or descenders of characters extend beyond the search limit or range of the given threshold (t). For illustration, we consider the example shown in Figure 10(a) and Figure 10(b), which represent the spatial domain and the compressed version of a document image respectively. A threshold (t) is chosen for demonstration. In this example, an initial source node is assumed at a given position. Applying the concept of the previous section for choosing an appropriate starting node results in a new position as the new Start Node. The new source node has the maximum weight or distance (7 units) and falls within the range of t, close to the initial source node. Further, the first successor hub is chosen; it carries a weight of 6 units and covers a distance of 11 units from the left margin. It is chosen because this distance of 11 units is greater than the distance (weight) of the source node, which covers 7 units. Other considerations are that the distances calculated for the remaining positions in the search range are less than or equal to that of the chosen position, and that this node lies closest to its predecessor node. The next hop is more complicated because we encounter an obstacle along the way, and the search is restricted to the range of the chosen threshold (t). In this situation, we cross over the hurdle by choosing the next hub beyond the obstacle as the successor hub.
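The crossover step can be illustrated with a short sketch. It assumes that, when every row in the search range is blocked, the agent stays on its current row and jumps to the end of the black-run that blocks it, which is one reading of "choosing the next successive column (black-runs)"; the function name is hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate
from typing import List, Optional


def cross_obstacle(runs: List[int], x: int) -> Optional[int]:
    """Return the first pixel column after the black-run covering column x.

    Returns x unchanged if column x is already white, and None if x lies
    beyond the end of the row.
    """
    ends = list(accumulate(runs))  # cumulative distance after each run
    i = bisect_right(ends, x)      # index of the run that contains column x
    if i >= len(ends):
        return None
    if i % 2 == 0:                 # 0-based even index is a white-run
        return x
    return ends[i]                 # skip to the end of the blocking black-run
```

Each call that returns a value larger than `x` counts as one crossover, which makes it easy to keep the number of crossovers per path minimal.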

Finding Correspondence between Source and Target
Identifying the correspondence between the terminal points at opposite margins is one of the challenging aspects mentioned in the study [13]. In this article, the actual source node is chosen based on the weight (distance) distributed across its neighbors. The destination is determined by the search space of the proposed model. The agent may start at a new source position and reach the probable destination, which establishes the correspondence between these two nodes at the extreme ends of the document. This applies to all the source nodes and target nodes. Figure 11 shows the correspondence between the source and terminal nodes and the resulting text-line segmentation. In Section 3, we described the correspondence of the nodes as a one-to-one relation with strictly no onto relationship and no crossovers between the terminal points. This can be related to a bipartite graph in which the order or position is maintained. The problem of over separation mentioned in the study [13] concerns spotting two or more separator points between two adjacent text-lines. This occurs when part of a text line is identified as a non-text (white space) region. One reason for over separation is the concavity of characters. Another factor is a character composed of multiple disjoint fractions or components. Figure 12 depicts the problem of over separation, where the source nodes s1 and s2 both start between two adjacent text-lines. Taking an average distance between the separator points may not produce the expected result, as experimented in [13]. Our proposed method tackles this problem by examining the correlation between the adjacent paths. In this example, the paths traced by the agent from the two source points s1 and s2 reach their respective terminal points. It is evident that these terminal points are very close to one another.
The two paths traced are also aligned with each other along most of their length. To resolve this issue, we ignore the path(s) that involve more hubs (nodes). In other words, we retain the path that holds fewer intermediate hubs. It is also necessary to retain a path that is relatively parallel to both its predecessor and successor paths. Sometimes this rule may fail when a text line occupies less space than its predecessors, as shown in Figure 13. This occurs mostly with the last text-line of a paragraph, which has fewer words. However, this situation is addressed by examining the average distance between the two adjacent paths, as mentioned above. It is computed by We illustrate the distance between the adjacent paths with an example. Figure 14 shows the paths traced along the two text-line gaps. Table 4 shows the coordinate positions of the paths traced. We assume an interval of 500 for illustration. At every interval, we take the coordinate positions of both paths. Table 5 shows the coordinate positions of the paths at this interval gap. The distance calculated for the paths (path 1 and path 2) is given under.
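A minimal sketch of an average distance between two adjacent paths is given below. The measure is an assumption chosen for illustration and may differ from the authors' own expression: each path is a list of (column, row) hub positions sorted by column, both paths are sampled at a fixed interval, and the absolute row differences are averaged. The names `row_at` and `mean_gap` and the `width` parameter are hypothetical.

```python
from bisect import bisect_right
from typing import List, Tuple

Path = List[Tuple[int, int]]  # (column, row) of each hub, sorted by column


def row_at(path: Path, x: int) -> int:
    """Row of the last hub whose column is at or before x."""
    columns = [column for column, _ in path]
    index = bisect_right(columns, x) - 1
    return path[max(index, 0)][1]


def mean_gap(path_a: Path, path_b: Path, interval: int, width: int) -> float:
    """Average absolute row difference between two paths, sampled every
    `interval` columns from 0 up to and including `width`."""
    samples = range(0, width + 1, interval)
    gaps = [abs(row_at(path_a, x) - row_at(path_b, x)) for x in samples]
    return sum(gaps) / len(gaps)
```

A mean gap far below the typical spacing of neighboring paths would flag the two paths as candidates for over separation.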

Datasets
Our proposed method is evaluated on various benchmark handwritten datasets such as ICDAR'13 [31] and others [17] comprising Kannada, Oriya, Persian and Bangla documents. For experimental purposes, the compression standards for these datasets are preserved as discussed in [16].

Results
Our proposed method is evaluated using two measures taken from a study [31]: (i) Detection Rate (DR) and (ii) Recognition Accuracy (RA). DR and RA are defined as follows: The DR in spotting the separator points at both the left and right sides of the document is provided in the literature [13] and is also tabulated (Table 6). Table 7 shows the results of the proposed model on the benchmark datasets. Our method starts only from the terminal points spotted on the left side of the document, so the RA for segmenting the text lines depends entirely on the separator points spotted along the left margin. Figure 15 shows the comparative performance of both DR and RA for the various benchmark datasets. It is evident that the algorithm works better on the ICDAR dataset. We also observe that the RA for the Persian dataset is lower than for the other datasets. This is because Persian writing starts from the right end of the document and proceeds towards the left, unlike most of the other scripts. Other reasons include greater concavity and more disjoint components in the composition of the characters. The algorithm uses a common threshold value (t) for every search space to handle obstacles; its value is based on experimentation conducted on the datasets. Figure 16 shows the RA for different threshold values with respect to the various datasets. A threshold value of 20 units provides better accuracy for ICDAR13, Kannada, Oriya and Bangla, whereas a threshold value of 28 units gives a higher accuracy rate for the Persian script. In this research work, we have chosen a threshold value of 20 units, which is common to all the datasets. Real-time performance is measured by running the algorithm on the various benchmark datasets to understand the working principle of the model.
Figure 17 shows the comparative study of both (i) compressed and (ii) uncompressed versions of the datasets. Here, we observe that processing the compressed version of a document takes less time (in milliseconds) than processing the uncompressed version. We also observe that the time for text-line segmentation of uncompressed documents invariably increases exponentially with the number of images. Finding the correspondence between S and T is a challenging task. However, we have attempted to find the correspondence based on the correlation between the adjacent paths, as illustrated in Section 4.5. The first path of the document starts from the very first terminal point identified in the first column of G. This path is presumed to be the reference line. The correlation is then calculated between a new path and the reference line at a fixed interval. The same is repeated for the following paths with reference to the source nodes. Figure 18 shows the RA achieved by the model.
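For reference, the two measures can be sketched under the assumption that they follow the ICDAR 2013 handwriting segmentation contest protocol [31], where DR is the number of one-to-one matches divided by the number of ground-truth text lines and RA is the number of one-to-one matches divided by the number of detected text lines. The function and parameter names are hypothetical.

```python
def detection_rate(one_to_one: int, ground_truth_lines: int) -> float:
    """DR: share of ground-truth text lines matched one-to-one."""
    return one_to_one / ground_truth_lines


def recognition_accuracy(one_to_one: int, detected_lines: int) -> float:
    """RA: share of detected text lines matched one-to-one."""
    return one_to_one / detected_lines
```

For instance, 90 one-to-one matches against 100 ground-truth lines and 96 detected lines give a DR of 0.9 and an RA of 0.9375.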

Conclusion
In this paper, we have proposed an algorithm that segments the text-lines of a document image by operating directly on its compressed representation. We make use of the current state-of-the-art technique that identifies the terminal positions of every text-line in the compressed representation. We have shown the working principle of the algorithm with an illustration. An agent (or robot) tunnels the path between the terminal points spotted at the opposite extreme ends of the document representation. We have discussed the search space for tunneling the path when it encounters obstacles caused by oscillating, tilted, touching and curvy text-lines. We have justified the choice of threshold values for the search space with experimental results. A comparative study of processing uncompressed and compressed versions for text-line segmentation has also been detailed. Further, a significant reduction in time complexity compared to processing the uncompressed image for segmentation is shown. Some interesting observations are made to overcome the limitations of [13], such as finding the correspondence between the terminal points. Notably, the tunnel or path that segments two adjacent text lines is entirely based on the terminal points. The source terminal points are predicted based on the search space, SS, whereas the corresponding target points are identified by the agent. Though working directly on an uncompressed version of the document is very challenging, we have designed a model that operates directly on the compressed representation of the document images for text-line segmentation. We achieve a recognition accuracy of 89.2% on the benchmark dataset (ICDAR13). The confidence level in achieving 89.2% with respect to this dataset is 100%.
The proposed model has some limitations, such as working only with invariable skew or tilt levels. The accuracy rate depends upon the detection rate reported in the literature and on a common threshold value t for all datasets. Another limitation is that the correlation between the paths is calculated with respect to a reference line. One future avenue is bidirectional search, where two agents start simultaneously from opposite terminal points and meet at a junction. Employing multiple agents leads to parallel processing, which would improve the performance of the system. We have followed an unguided medium to tunnel the path by adapting various algorithmic strategies, including greedy and dynamic programming with AI techniques. Wrong segmentation could possibly be avoided by using a guided medium [4] along with this strategy, which is one of the future directions of this research work.