Making multiple files from multiple files with one command in GNU make

回眸只為那壹抹淺笑 submitted on 2019-12-22 14:03:02

Question


Assume 1000 files with extension .xhtml are in directory input, and that a certain subset of those files (with output paths in $(FILES), say) need to be transformed via xslt to files with the same name in directory output. A simple make rule would be:

$(FILES): output/%.xhtml : input/%.xhtml
    saxon -s:$< -o:$@ foo.xslt

This works, of course, doing the transform one file at a time. The problem is that I want to use saxon's batch processing to do all the files at one time, since, given the number of files, that would be much faster, considering the overhead of loading java and saxon for each file. Saxon allows the -s (source) option to be a directory and processes all files in that directory, placing the results with the same name in the directory specified in the -o: option.

I'm aware of the well-known technique to get GNU make to do a single command to update multiple files by using pattern rules:

output/%.xhtml: input/%.xhtml
    saxon -s:input -o:output foo.xslt

But in my case this suffers from two problems. First, it will run the transform on all files in the input directory, not just the ones that have changed; and second, it will not limit the transform to the subset of files specified in $(FILES). The GNU make feature of running a recipe given in a pattern rule only once for all matched targets does not work in the case of so-called "static pattern rules" (see [here]), as the rule given at the top of the post is known.

In order to use the saxon batching feature, I need to create a temporary directory, copy to it only those files to be processed, then run the transform with that temporary directory as the input directory. I tried creating a temporary directory and remembering its name for future use in a target-specific variable, using

$(FILES): TMPDIR:=$(shell mktemp -d)

but this creates a new temporary directory for every single target that is out-of-date. In any case, I'm not sure how to structure the rule that would then copy the necessary files into that directory. I don't want to create the temporary directory at the time the makefile is parsed, since I have a non-recursive make system that will parse all make files, even those not related to the current top-level target, and don't want to create the temporary directory for situations in which it is not necessary/will not be used.

I'm well aware that many questions have been asked on SO in the past about creating multiple files from a single input; one solution is (non-static) pattern rules; other solutions involve phony targets. However, in this case I'm stuck as to how to put all this together.

I can identify the files that changed and copy them using the static pattern rule

$(FILES): output/%.xhtml : input/%.xhtml
    TMPDIR=`mktemp -d` && cp $< "$$TMPDIR"

but actually I would prefer to copy the files with a single cp command, whereas this copies them one by one. Perhaps there is some application here of cp -u?
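One way to get the single cp is to do the staleness check outside make. The following is only a sketch under stated assumptions: the function name stage_stale and its argument order are made up for illustration, and it relies on the -nt test operator, which bash, dash and ksh support but which is not in every sh. It collects the inputs whose output is missing or older, then copies them into the staging directory with one cp call.

```shell
#!/bin/sh
# stage_stale INPUT_DIR OUTPUT_DIR STAGE_DIR NAME...
# (hypothetical helper, not part of make or saxon)
# Copies into STAGE_DIR, with a single cp, every INPUT_DIR/NAME whose
# OUTPUT_DIR/NAME is missing or older than the input.
stage_stale() {
    in_dir=$1
    out_dir=$2
    stage_dir=$3
    shift 3
    n=$#
    # Walk the original arguments once; append the path of each stale
    # input to the end of the argument list as we go.
    while [ "$n" -gt 0 ]; do
        name=$1
        shift
        n=$((n - 1))
        if [ ! -e "$out_dir/$name" ] || [ "$in_dir/$name" -nt "$out_dir/$name" ]; then
            set -- "$@" "$in_dir/$name"
        fi
    done
    # "$@" now holds only the stale input paths; skip cp if there are none.
    [ "$#" -eq 0 ] || cp -- "$@" "$stage_dir"
}
```

A caller could create the staging directory with mktemp -d, call stage_stale with the file names behind $(FILES), and then run the batch transform from the question with that directory as the -s: argument.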

I also considered using an ad-hoc extension for those files needing updating but could not see how to get this to work either. I'm about to give up and just run the saxon transform on all files when any of them have changed, but is there any better way?


Answer 1:


Personally, I wouldn't try to do this from the command line. That's partly because I'm not a shell scripting wizard. I'm not an Ant wizard either, but because the requirement is to process only files that have changed, this seems to fall very much into Ant territory. On the other hand, Ant will recompile the stylesheet for each transformation, which is an overhead you might want to avoid; if that's the case then your best bet is probably to write a little Java application. It's probably only 100 lines or fewer.

A final possibility is to do the processing within Saxon: that is, a single transformation that reads multiple input files using the collection() function and generates multiple result files using xsl:result-document. Saxon (commercial editions) offers an extension function last-modified that allows you to filter the files to be processed. With 1000 files you might also want the extension function saxon:discard-document() to prevent the heap filling.




Answer 2:


Personally, I like your original one-compiler-per-file formulation. Doesn't this work well with make's -j n flag?

You can of course batch up files by copying, and then running saxon at the end. Recursive make (ugh!) can sort out the ordering. Something like:

.PHONY: all
all:
    rm -rf tmpdir
    mkdir tmpdir
    ${MAKE} tmpdir/sentinel
    saxon -s:tmpdir -o:output foo.xslt

tmpdir/sentinel: $(FILES) ; touch $@

$(FILES): output/%.xhtml: input/%.xhtml
    ln $< $(patsubst input/%,tmpdir/%,$<)

This does work, though I am very queasy about lying to make (the static pattern rule purports to create the target in output/, but in fact does its dirty deed in tmpdir/).

Note in the recipe for tmpdir/sentinel, that $? is correctly set to the list of output files that are out of date. This might be useful if you can pass a bunch of files to saxon rather than a folder.




Answer 3:


I think one issue here is that 'saxon' supports either one file or all files in a directory, so isn't suitable for batch processing without copying to temporary directories.

Otherwise, this is quite simple to do by using a timestamp marker file as a proxy target. For example:

output/.timestamp : $(FILES)
    mkdir -p $(@D)
    $(COMMAND) -outputdir=output $?
    touch $@

The three commands are:

  1. Ensure that the output directory exists.
  2. Run the batch command on files newer than the timestamp file.
  3. Update the timestamp file (creating it if necessary).

Remember that each line of a recipe is executed in its own shell, and that if any line fails, the subsequent lines are not invoked.
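For readers who want to see the same logic outside make, here is a rough shell sketch of what the rule does. It is an illustration under assumptions, not the answer's code: run_batch is a hypothetical name, the batch command is assumed to accept the stale files as trailing arguments, and the -nt test operator is assumed to be available (it is in bash, dash and ksh). As in the recipe, the timestamp file is only touched if the batch command succeeds.

```shell
#!/bin/sh
# run_batch STAMP COMMAND FILE...
# (hypothetical helper mirroring the three recipe lines above)
# Runs COMMAND once on every FILE newer than STAMP, or on every FILE if
# STAMP does not exist yet, then updates STAMP.
run_batch() {
    stamp=$1
    batch_cmd=$2
    shift 2
    # 1. Ensure the directory holding the timestamp file exists.
    mkdir -p "$(dirname "$stamp")" || return 1
    # Keep only the files newer than the stamp, similar to make's $?.
    n=$#
    while [ "$n" -gt 0 ]; do
        f=$1
        shift
        n=$((n - 1))
        if [ ! -e "$stamp" ] || [ "$f" -nt "$stamp" ]; then
            set -- "$@" "$f"
        fi
    done
    # Nothing is out of date: do nothing, as make would.
    [ "$#" -gt 0 ] || return 0
    # 2. Run the batch command; on failure, leave the stamp untouched.
    "$batch_cmd" "$@" || return 1
    # 3. Update the timestamp file (creating it if necessary).
    touch "$stamp"
}
```

Because a failed run leaves the timestamp file alone, the same files are picked up again on the next attempt.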

This approach is useful with Java builds.



Source: https://stackoverflow.com/questions/14914870/making-multiple-files-from-multiple-files-with-one-command-in-gnu-make
