I have input files with a structure like the following:
a1
  b1
    c1
    c2
    c3
  b2
    c1
      d1
      d2
  b3
  b4
a2
a3
  b1
  b2
    c1
    c2
I recently had to do something similar enough that with a few tweaks I can post my script here:
#!/bin/bash

prev_level=-1
# Index into node array
i=0

# Regex to screen-scrape all nodes
tc_re="^(( )*)(.*)$"
while IFS= read -r ln; do
    if [[ $ln =~ $tc_re ]]; then
        # folder level indicated by spaces in preceding node name
        spaces=${#BASH_REMATCH[1]}
        # 2 space characters per level
        level=$(($spaces / 2))
        # Name of the folder or node
        node=${BASH_REMATCH[3]}
        # get the rest of the node path from the previous entry
        curpath=( ${curpath[@]:0:$level} $node )
        # increment i only if the current level is <= the level of the previous
        # entry
        if [ $level -le $prev_level ]; then
            ((i++))
        fi
        # add this entry (overwrite previous if $i was not incremented)
        tc[$i]="${curpath[@]}"
        # save level for next iteration
        prev_level=$level
    fi
done

for p in "${tc[@]}"; do
    echo "${p// //}"
done
Input is taken from STDIN, so you'd have to do something like this:
$ ./tree2path.sh < ifile.tree
a1/b1/c1
a1/b1/c2
a1/b1/c3
a1/b2/c1/d1
a1/b2/c1/d2
a1/b3
a1/b4
a2
a3/b1
a3/b2/c1
a3/b2/c2
$
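The step that does most of the work in the script is the array slice that keeps only the ancestors of the current node before appending it. Here is a small sketch of just that step, with made-up values (the sample path and node are assumptions for illustration, not part of the script):

```shell
#!/bin/bash
# Path of the previous entry, one array element per component (sample values).
curpath=(a1 b2 c1 d2)
# Suppose the next line read is "  b3": two leading spaces, so level 1.
level=1
node=b3
# Keep the first $level components (the ancestors), then append the new node.
curpath=( "${curpath[@]:0:$level}" "$node" )
# Join the components with "/" for display; the subshell keeps IFS local.
( IFS=/; echo "${curpath[*]}" )
```

This prints a1/b3: the deeper components c1 and d2 are dropped because they lie beyond the new node's level.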
Interesting question.
This awk command (it could be written as a one-liner) does the job. The separator passed to -F has to be exactly one level of indentation (assumed here to be two spaces), so that NF is the depth of the line plus one:
awk -F'  ' 'NF<=p{for(i=1;i<=p;i++)printf "%s%s", a[i],(i==p?RS:"/")
            if(NF<p)for(i=NF;i<=p;i++) delete a[i]}
    {a[NF] =$NF;p=NF }
    END{for(i=1;i<=NF;i++)printf "%s%s", a[i],(i==NF?RS:"/")}' file
As you can see, the printing loop is duplicated; you can extract it into a function if you like.
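For what it's worth, the duplicated printing loop could be factored out roughly like this. It is only a sketch: the names tree2path and print_path are made up for illustration, and it assumes exactly two spaces of indentation per level. It also skips the delete step, since only the first p entries of the array are ever printed.

```shell
#!/bin/bash
# tree2path (hypothetical name): read an indented tree on stdin and
# print one slash-separated path per leaf.
# Assumption: exactly two spaces of indentation per level.
tree2path() {
    awk -F'  ' '
    # print_path (hypothetical helper): join a[1..n] with "/" and end the line
    function print_path(n,    i) {
        for (i = 1; i <= n; i++)
            printf "%s%s", a[i], (i == n ? "\n" : "/")
    }
    # Same depth or shallower than the previous line: the previous line was a leaf
    NF <= p { print_path(p) }
    # Remember this node at its depth (NF is depth + 1) and the depth itself
    { a[NF] = $NF; p = NF }
    # The last line of the input is always a leaf
    END { if (p) print_path(p) }
    '
}
```

It would be used as `tree2path < file`, the same way as the command above.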
Test with your data:
kent$ cat f
a1
  b1
    c1
    c2
    c3
  b2
    c1
      d1
      d2
  b3
  b4
a2
a3
  b1
  b2
    c1
    c2
kent$ awk -F'  ' 'NF<=p{for(i=1;i<=p;i++)printf "%s%s", a[i],(i==p?RS:"/")
            if(NF<p)for(i=NF;i<=p;i++) delete a[i]}
    {a[NF] =$NF;p=NF }END{for(i=1;i<=NF;i++)printf "%s%s", a[i],(i==NF?RS:"/")} ' f
a1/b1/c1
a1/b1/c2
a1/b1/c3
a1/b2/c1/d1
a1/b2/c1/d2
a1/b3
a1/b4
a2
a3/b1
a3/b2/c1
a3/b2/c2