How can I convert HTML to Textile?

旧时模样 submitted on 2019-12-03 11:38:46

I know this is an old question, but I found myself trying to do this the other day and not finding anything useful, until I found Pandoc. It can convert loads of other markup formats as well - it's quite brilliant.
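As a rough sketch of how Pandoc could be driven from a program, the Java snippet below pipes an HTML string through `pandoc --from=html --to=textile` and reads the Textile from its standard output. It assumes the `pandoc` executable is installed and on the PATH, and that the input is small enough to be written in one go. The class and method names (`PandocTextile`, `buildCommand`, `convert`) are made up for illustration and are not part of Pandoc.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class PandocTextile {
    // Command line: read HTML on stdin, write Textile to stdout.
    static List<String> buildCommand() {
        return Arrays.asList("pandoc", "--from=html", "--to=textile");
    }

    // Hypothetical helper: assumes pandoc is installed and on the PATH.
    static String convert(String html) throws IOException, InterruptedException {
        Process p = new ProcessBuilder(buildCommand())
                .redirectError(ProcessBuilder.Redirect.INHERIT)
                .start();
        // Send the HTML to pandoc and close stdin so it knows the input is complete.
        try (OutputStream stdin = p.getOutputStream()) {
            stdin.write(html.getBytes(StandardCharsets.UTF_8));
        }
        // Requires Java 9+ for readAllBytes().
        String textile = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        if (p.waitFor() != 0) {
            throw new IOException("pandoc exited with an error");
        }
        return textile;
    }

    public static void main(String[] args) throws InterruptedException {
        try {
            System.out.print(convert("<p>Hello <strong>world</strong></p>"));
        } catch (IOException e) {
            // Typically means pandoc is missing or rejected the input.
            System.err.println("Could not run pandoc: " + e.getMessage());
        }
    }
}
```

For one-off conversions the same flags can be used directly on the command line with input and output files instead of pipes.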

Here is a C# library that converts HTML to Textile. Note that it produces Textile with the library authors' own additions, not pure Textile.

Since there was no JavaScript implementation, I wrote one: https://github.com/cmroanirgo/to-textile

It's a little primitive at the moment, as it's a blind port of the 'to-markdown' equivalent, but it should get the job done.

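Try this simple Java code; I hope it works for you. It only downloads a page and saves the raw HTML to a file, so the conversion to Textile still has to happen afterwards.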

import java.net.*;
import java.io.*;

class Crawle
{
    public static void main(String[] args) throws Exception
    {
        URL url = new URL("https://www.google.co.in/#q=i+am+happy");
        // Make sure the output directory exists before opening the file.
        new File("crawler").mkdirs();
        // try-with-resources closes the reader and the writer even if reading fails.
        try (BufferedReader br = new BufferedReader(new InputStreamReader(url.openStream()));
             PrintWriter pr = new PrintWriter(new FileOutputStream("crawler/file.txt"), true))
        {
            String data;
            // Copy the page line by line to the file and to the console.
            while ((data = br.readLine()) != null)
            {
                pr.println(data);
                System.out.println(data);
            }
        }
    }
}

This is a simple markup replacement, nothing a good regex could not fix.
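To make the regex idea concrete, here is a minimal sketch in Java that maps a handful of HTML tags to their Textile equivalents with `String.replaceAll`. It assumes small, well-formed fragments without nested tags of the same kind; it does not decode HTML entities, and it treats every list item as an unordered bullet. The class name `HtmlToTextile` and its `convert` method are hypothetical, chosen only for this example.

```java
class HtmlToTextile {
    // Converts a simple, well-formed HTML fragment to Textile using regexes only.
    static String convert(String html) {
        String t = html;
        // <h1>..</h1> through <h6>..</h6> become "h1. .." through "h6. .."
        t = t.replaceAll("(?is)<h([1-6])(?:\\s[^>]*)?>(.*?)</h\\1>", "h$1. $2\n\n");
        // <strong>/<b> become *bold*
        t = t.replaceAll("(?is)<(strong|b)>(.*?)</\\1>", "*$2*");
        // <em>/<i> become _italic_
        t = t.replaceAll("(?is)<(em|i)>(.*?)</\\1>", "_$2_");
        // <a href="url">text</a> becomes "text":url
        t = t.replaceAll("(?is)<a\\s[^>]*href=\"([^\"]*)\"[^>]*>(.*?)</a>", "\"$2\":$1");
        // <li>item</li> becomes "* item" (ordered lists are not distinguished)
        t = t.replaceAll("(?is)<li(?:\\s[^>]*)?>(.*?)</li>", "* $1\n");
        // <p>..</p> becomes a paragraph followed by a blank line
        t = t.replaceAll("(?is)<p(?:\\s[^>]*)?>(.*?)</p>", "$1\n\n");
        // <br> becomes a line break
        t = t.replaceAll("(?i)<br\\s*/?>", "\n");
        // Drop any tags that are left over (ul, div, span, ...)
        t = t.replaceAll("(?s)<[^>]+>", "");
        return t.trim();
    }

    public static void main(String[] args) {
        String html = "<h1>Title</h1><p>Some <strong>bold</strong> and <em>italic</em> text.</p>";
        System.out.println(convert(html));
    }
}
```

Anything beyond this kind of flat markup (tables, nested lists, attributes that matter) is where a regex-only approach starts to break down and a real HTML parser or a converter tool becomes the safer choice.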

I recommend Perl, LWP::Simple and some regexes to do the whole thing (spidering, stripping design and menus, converting to Textile, and then posting to the database).
