Convert tabbed text to html unordered list?


难免孤独 2021-01-03 03:11

I'm a beginner programmer, so this question might sound trivial: I have some text files containing tab-delimited text like:


      
      
        
4 answers

孤城傲影 (original poster) 2021-01-03 03:58
              

            
            
                        
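The algorithm is simple. The depth of a line is the number of tabs (\t) it contains. Compare it with the depth of the previous line: if it is greater, shift the next bullet to the right by opening a nested list; if it is smaller, shift it to the left by closing lists; if it is equal, leave the bullet at the same level.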
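
As a rough illustration of the depth comparison described above, here is a minimal sketch. The function name tabs_to_ul is hypothetical and not part of the script in this answer. Unlike that script, the sketch places each nested list inside its parent item and escapes the item text with html.escape; both choices are assumptions about what well-formed output should look like, not something the original states.

```python
import html


def tabs_to_ul(text):
    """Convert tab-indented lines into nested <ul> markup (hypothetical helper)."""
    out = []
    depth = -1  # -1 means no list has been opened yet
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        item = raw.lstrip('\t')
        d = len(raw) - len(item)  # number of leading tabs = depth of this line
        if d > depth:
            # Deeper: open one list; for jumps of more than one level,
            # wrap each extra list in an empty item so the tags stay balanced.
            out.append('<ul>' + '<li><ul>' * (d - depth - 1))
        else:
            # Same level or shallower: close the previous item,
            # plus one nested list (and its parent item) per level left.
            out.append('</li>' + '</ul></li>' * (depth - d))
        out.append('<li>' + html.escape(item))
        depth = d
    if depth >= 0:
        # Close the last item and every list that is still open.
        out.append('</li>' + '</ul></li>' * depth + '</ul>')
    return ''.join(out)


print(tabs_to_ul("a\n\tb\nc"))
# prints <ul><li>a<ul><li>b</li></ul></li><li>c</li></ul>
```

Keeping only the previous depth is enough here because every closing tag can be derived from the difference between two consecutive depths.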

Make sure your "in.txt" really contains tabs; if you copy the sample from here, replace the indentation with tabs. If the indentation is made of spaces, nothing works. The script splits the input into blocks at blank lines, so end the list with a blank line. You can change the separator in the code if you want.
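
If the sample was pasted with space indentation, something like the following could turn the leading spaces back into tabs before the script runs. This is only a sketch: the name spaces_to_tabs is hypothetical, and the width of 4 spaces per level is an assumption about how the text was pasted.

```python
import re


def spaces_to_tabs(text, width=4):
    """Replace leading spaces with tabs, one tab per `width` spaces (assumed width)."""
    def fix(match):
        # Leftover spaces that do not fill a whole level are dropped.
        return '\t' * (len(match.group(0)) // width)

    # ^ with re.MULTILINE matches at the start of every line.
    return re.sub(r'^ +', fix, text, flags=re.MULTILINE)


sample = "qqq\n    www\n        яяя"
print(repr(spaces_to_tabs(sample)))
# prints 'qqq\n\twww\n\t\tяяя'
```

Spaces inside an item are left alone, since only runs at the start of a line are matched.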

J.F. Sebastian's solution is fine but doesn't process unicode.

Create a text file "in.txt" in UTF-8 encoding:

qqq
    www
    www
        яяя
        яяя
    ыыы
    ыыы
qqq
qqq


Then run the script "ul.py". It creates "out.html" and opens it in Firefox.

#!/usr/bin/python
# -*- coding: utf-8 -*-

# The script exports a tabbed list from string into a HTML unordered list.

import io, subprocess, sys

f=io.open('in.txt', 'r',  encoding='utf8')
s=f.read()
f.close()

#---------------------------------------------

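def ul(s):
    # Blocks are separated by blank lines; each block is converted on its own.
    L=s.split('\n\n')

    # Page header (declared as UTF-8 so the non-ASCII items display correctly).
    s='<html><head><meta charset="utf-8"><title>List Out</title></head><body>\n'

    for p in L:
        # Only blocks that contain at least one tab are treated as lists.
        if p.find('\t') != -1:

            l=p.split('\n')
            depth=0   # depth of the previous line
            e='<ul>'
            i=0       # number of nested <ul> currently open

            for line in l:
                if len(line) >0:
                    a=line.split('\t')
                    d=len(a)-1   # depth of this line = number of tabs

                    if depth==d:
                        # same level
                        e=e+'<li>'+line+'</li>'

                    elif depth < d:
                        # deeper: open a nested list
                        i=i+1
                        e=e+'<ul><li>'+line+'</li>'
                        depth=d

                    elif depth > d:
                        # shallower: close the lists that were left
                        e=e+'</ul>'*(depth-d)+'<li>'+line+'</li>'
                        depth=d
                        i=depth

            # close the nested lists still open, then the outer list
            e=e+'</ul>'*i+'</ul>'
            p=e.replace('\t','')

            # sanity check: opening and closing tags must balance
            n1=e.count('<ul>')
            n2=e.count('</ul>')

            if n1 != n2:
                msg='<p>Wrong bullets position.<br>&lt;ul&gt;: '+str(n1)+'<br>&lt;/ul&gt;: '+str(n2)+'<br> Correct your source.</p>'
                p=p+msg

        s=s+p+'\n\n'

    return s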

#-------------------------------------      

def detach(cmd):
    # Start the browser without waiting for it, then end the script.
    subprocess.Popen(cmd, shell=True)
    sys.exit()

s=ul(s)

f=io.open('out.html', 'w',  encoding='utf8')
f.write(s)
f.close()

cmd='firefox out.html'
detach(cmd)


HTML will be:


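<html><head><meta charset="utf-8"><title>List Out</title></head><body>
<ul><li>qqq</li><ul><li>www</li><li>www</li><ul><li>яяя</li><li>яяя</li></ul><li>ыыы</li><li>ыыы</li></ul><li>qqq</li><li>qqq</li></ul>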

    
             
                                                        
            
            
              
                