As minimum feature size keeps shrinking, and the next generation lithography (e.g, EUV) further delays, double patterning lithography (DPL) has been widely recognized as a feasible lithography solution in 20nm technology node. However, as technology continues to scale to 14/10nm, DPL begins to show its limitations and usually generates too many undesirable stitches. Triple patterning lithography (TPL) is a natural extension of DPL to conquer the difficulties and achieve a stitch-free layout decomposition. In this paper, we study the standard cell based row-structure layout decomposition problem in TPL. Although the general TPL layout decomposition problem is NP-hard, in this paper we will show that for standard cell based TPL layout decomposition problem, it is polynomial time solvable. We propose a polynomial time algorithm to solve the problem optimally and our approach has the capability to find all stitch-free decompositions. Color balancing is also considered to ensure a balanced triple patterning decomposition. To speed up the algorithm, we further propose a hierarchical algorithm for standard cell based layout, which can reduce the run time by 34.5% on average without sacrificing the optimality. We also extend our algorithm to allow stitches for complex circuit designs, and our algorithm guarantees to find optimal solutions with minimum number of stitches.