Text appearing in images often marks a region of interest. Locating such areas for further analysis helps to extract image-related information and facilitates many applications. Pixel-based segmentation and region-based object classification are two methodologies for identifying text areas in images, each with its own advantages and drawbacks. In this research, we propose a text detection scheme consisting of a pixel-based classification network and a supplementary region proposal network. The main network is a Fully Convolutional Network (FCN) that employs Feature Pyramid Networks (FPN) and Atrous Spatial Pyramid Pooling (ASPP) to indicate possible text areas and text borders with high recall. Certain areas are then further processed by the refinement network, a simplified Connectionist Text Proposal Network (CTPN) with high precision. Non-Maximum Suppression (NMS) is finally applied to form appropriate text-lines. The experimental results demonstrate the feasibility of the scheme.
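
To illustrate the final NMS step, the following is a minimal sketch of standard greedy NMS over axis-aligned boxes. It is not the implementation used in this research: the function names `iou` and `nms`, the `(x1, y1, x2, y2)` box format, and the IoU threshold of 0.5 are all assumptions made for illustration, and the text-line formation itself is not reproduced.

```python
from typing import List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2), with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    The threshold value is an assumption, not taken from the paper.
    """
    # Visit candidates from the most to the least confident.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    candidates = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    confidences = [0.9, 0.8, 0.7]
    print(nms(candidates, confidences))  # [0, 2]
```

In this example the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the third box does not overlap either and is kept.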