Cleaning up version-controlled directories
When writing LaTeX documents, you often end up with all kinds of redundant files. These files are generated while compiling a dvi or pdf document and can generally be discarded afterwards. Most of my tex documents are under version control, so it is possible to get a list of the files that are (and are not) under this control. Using a single Bash command, you can abuse Subversion to determine which files you want to delete:
rm -i `svn status | awk '/^\?/ {print $2}'`
Perhaps daunting at first, this command can be broken apart into understandable substatements. The first part (rm -i `<command>`) states that the output of the command between the backticks is handed to the rm program as its list of arguments.
The second part is a pipeline consisting of an svn command and an awk command. The svn command recursively generates a table with the version-control status of the directory tree:
A       testjuh.txt
?       data.tex
?       data.aux
M       contents.tex
?       contents.toc
?       contents.toc.old
?       contents.aux
?       randomtext.cpp
It should be noted that the lines starting with a question mark denote the files that are not under version control.
This list is interpreted by awk. Here things get interesting: awk is a program that matches each line of its text input against regular expressions and runs an action for every match. It is particularly well suited to processing tables in which rows are separated by end-of-line characters and cells by whitespace.
In this case, I have told awk to print the value in the second column (the filename) for every row that the regular expression ^\? matches. The regular expression matches only table rows that start with a question mark, so this command generates a list of files that are not under version control. For the example above:
data.tex
data.aux
contents.toc
contents.toc.old
contents.aux
randomtext.cpp
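As a quick check of the filter, the sketch below feeds the sample status table to the same awk program through printf instead of calling svn, so it can be tried outside a working copy. The function name unversioned_files and the column spacing of the sample lines are assumptions made for illustration.

```shell
# Hypothetical helper: reads `svn status` output on stdin and prints
# the second column of every line that starts with a question mark.
unversioned_files() {
    awk '/^\?/ {print $2}'
}

# Sample status table from the example (spacing is illustrative).
printf '%s\n' \
    'A       testjuh.txt' \
    '?       data.tex' \
    '?       data.aux' \
    'M       contents.tex' \
    '?       contents.toc' \
    '?       contents.toc.old' \
    '?       contents.aux' \
    '?       randomtext.cpp' | unversioned_files
```

In a real working copy the same helper would sit behind the svn command: svn status | unversioned_files.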
This list will be processed by the rm command, resulting in the deletion of the uncontrolled files.
Beware!
You have probably noticed that two apparently important files appear in the list of files scheduled for deletion: data.tex and randomtext.cpp. If you forget to put critical files under version control (by adding them to the repository) or to hide them from svn status (by setting the svn:ignore property), you risk losing them with this command. That is why it is probably best to keep the -i switch in the rm statement as a precaution. It makes rm prompt for confirmation before every file it deletes.
Also note that this command differs from the 'svn revert -R' command. That command restores the version-controlled files to their original state: it reverts modified files and unschedules freshly added ones, but it leaves unversioned files on disk.
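One way to lower the risk of deleting important unversioned files such as data.tex is to let awk pass on only filenames with extensions that LaTeX is known to generate. This is a sketch under assumptions, not part of the original command: the extension list (aux, toc, log, out) and the function name latex_byproducts are illustrative and should be adapted to your own setup.

```shell
# Hypothetical helper: like the awk filter above, but it only prints
# unversioned files whose name ends in an assumed LaTeX byproduct extension.
latex_byproducts() {
    awk '/^\?/ && $2 ~ /\.(aux|toc|log|out)$/ {print $2}'
}

# Sample status lines (spacing is illustrative).
printf '%s\n' \
    '?       data.tex' \
    '?       data.aux' \
    '?       contents.toc' \
    '?       contents.toc.old' \
    '?       contents.aux' \
    '?       randomtext.cpp' | latex_byproducts
# prints data.aux, contents.toc and contents.aux, one per line
```

With this filter data.tex and randomtext.cpp never reach rm. The price is that contents.toc.old is skipped as well, because its name does not end in one of the listed extensions.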
Update: As the commenter below already implies, the command is a bit hard to read. Using the xargs program, the commandline can be simplified a little:
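svn status | awk '/^\?/ {print $2}' | xargs rm -i
One caveat: xargs does not connect the command it runs to the terminal by default, so rm -i cannot read the answers to its prompts here. Recent GNU and BSD versions of xargs offer the -o switch (xargs -o rm -i), which reopens the terminal for rm.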
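To see what xargs would actually run, a dry run with echo in front of rm is a harmless check. In the sketch below the function unversioned_sample is a hypothetical stand-in for the svn and awk part of the pipeline.

```shell
# Stand-in for `svn status | awk '/^\?/ {print $2}'`: two unversioned files.
unversioned_sample() {
    printf '%s\n' data.aux contents.toc
}

# With echo in front, xargs prints the rm command instead of executing it.
unversioned_sample | xargs echo rm -i
# prints: rm -i data.aux contents.toc
```

xargs collects the words it reads from standard input and appends them as arguments to a single rm invocation, which is the same job the backticks did in the first version.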
Comments
Or, you do this:
svn status | grep ^? | xargs rm
Nope, I believe that is not quite correct. xargs splits its input on whitespace (newlines as well as spaces), so your solution hands rm not only the unversioned filenames but also a whole batch of '?' arguments, the status indicator from the svn status command. xargs does not expand wildcards, so rm takes each '?' as the literal name of a file: it complains when no such file exists and deletes it when one does.
Of course, this is no problem when you do not have files like that, but I think it should be noted that your solution has some side effects.
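The side effect can be reproduced without deleting anything by putting echo in front of rm. This is a small sketch: the function sample_status and its column spacing are assumptions that stand in for real svn status output.

```shell
# Hypothetical sample of `svn status` output (spacing is illustrative).
sample_status() {
    printf '%s\n' \
        '?       data.aux' \
        'M       contents.tex' \
        '?       contents.toc'
}

# grep keeps the whole line, so the status column reaches rm as well.
sample_status | grep '^?' | xargs echo rm
# prints: rm ? data.aux ? contents.toc
```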