Indented lines (tree) to path-like lines

失恋的感觉 2021-01-03 05:46

I have input files with a structure like the following:

a1
  b1
    c1
    c2
    c3
  b2
    c1
      d1
      d2
  b3
  b4
a2
a3
  b1
  b2
    c1
    c2


        
2 Answers
  • 2021-01-03 06:40

    I recently had to do something similar enough that with a few tweaks I can post my script here:

    #!/bin/bash
    
    prev_level=-1
    # Index into node array
    i=0
    
    # Regex to screen-scrape all nodes
    tc_re="^((  )*)(.*)$"
    while IFS= read -r ln; do
        if  [[ $ln =~ $tc_re ]]; then
            # folder level indicated by spaces in preceding node name
            spaces=${#BASH_REMATCH[1]}
            # 2 space characters per level
            level=$(($spaces / 2))
            # Name of the folder or node
            node=${BASH_REMATCH[3]}        
            # keep the first $level path components from the previous entry
            # and append this node (quoted so names with spaces survive)
            curpath=( "${curpath[@]:0:level}" "$node" )
    
            # increment i only if the current level is <= the level of the previous
            # entry
            if [ "$level" -le "$prev_level" ]; then
                ((i++))
            fi
    
            # add this entry, joined with "/" (overwrites the previous entry
            # if $i was not incremented, i.e. the previous node was a parent)
            tc[i]=$(IFS=/; printf '%s' "${curpath[*]}")
    
            # save level for next iteration
            prev_level=$level
        fi
    done
    
    for p in "${tc[@]}"; do
        printf '%s\n' "$p"
    done
    

    Input is taken from STDIN, so you'd have to do something like this:

    $ ./tree2path.sh < ifile.tree 
    a1/b1/c1
    a1/b1/c2
    a1/b1/c3
    a1/b2/c1/d1
    a1/b2/c1/d2
    a1/b3
    a1/b4
    a2
    a3/b1
    a3/b2/c1
    a3/b2/c2
    $ 
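
    The regex step in the script can be tried in isolation. This is only a minimal sketch, assuming bash and two spaces per level as in the question; `split_line` is a hypothetical helper name that is not part of the script above.

    ```shell
    # split_line (hypothetical helper): print "<level> <name>" for one tree line.
    split_line() {
        # group 1: leading indentation in 2-space steps, group 3: node name
        local re='^((  )*)(.*)$'
        if [[ $1 =~ $re ]]; then
            # 2 space characters per level
            echo "$(( ${#BASH_REMATCH[1]} / 2 )) ${BASH_REMATCH[3]}"
        fi
    }

    split_line 'a1'        # prints: 0 a1
    split_line '    c1'    # prints: 2 c1
    ```

    The level is what decides how many components of the previous path are kept before the new node name is appended.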
    
  • 2021-01-03 06:43

    Interesting question.

    This awk command (which could be written as a one-liner) does the job:

    awk -F'  ' 'NF<=p{for(i=1;i<=p;i++)printf "%s%s", a[i],(i==p?RS:"/")
                if(NF<p)for(i=NF;i<=p;i++) delete a[i]}
                {a[NF] =$NF;p=NF }
                END{for(i=1;i<=NF;i++)printf "%s%s", a[i],(i==NF?RS:"/")}' file
    

    As you can see, the path-printing loop is duplicated; you can extract it into a function if you like.

    Test with your data:

    kent$  cat f
    a1
      b1
        c1
        c2
        c3
      b2
        c1
          d1
          d2
      b3
      b4
    a2
    a3
      b1
      b2
        c1
        c2
    
    kent$  awk -F'  ' 'NF<=p{for(i=1;i<=p;i++)printf "%s%s", a[i],(i==p?RS:"/")
    if(NF<p)for(i=NF;i<=p;i++) delete a[i]}
    {a[NF] =$NF;p=NF }END{for(i=1;i<=NF;i++)printf "%s%s", a[i],(i==NF?RS:"/")} ' f
    a1/b1/c1
    a1/b1/c2
    a1/b1/c3
    a1/b2/c1/d1
    a1/b2/c1/d2
    a1/b3
    a1/b4
    a2
    a3/b1
    a3/b2/c1
    a3/b2/c2    
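
    The duplicated printing loop mentioned above could be pulled into an awk function roughly like this. It is a sketch rather than the exact command from this answer: the names `tree2path` and `emit` are made up for illustration, it reads from STDIN, and it drops the `delete` step on the assumption that the tree is well formed (every level is overwritten before it is printed again).

    ```shell
    # tree2path (hypothetical wrapper name): indented tree on stdin, leaf paths on stdout
    tree2path() {
        awk -F'  ' '
            # emit(n): print the first n saved components joined by "/"
            function emit(n,    i) {
                for (i = 1; i <= n; i++)
                    printf "%s%s", a[i], (i == n ? "\n" : "/")
            }
            NF <= p { emit(p) }        # no deeper than the previous line: previous was a leaf
            { a[NF] = $NF; p = NF }    # remember this node at its depth
            END { emit(p) }            # the last line is always a leaf
        '
    }
    ```

    It would be called as `tree2path < f` and, for the data above, should give the same leaf paths as the original command.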
    