I am working on a program that takes user input for two file names. Unfortunately, the program can easily break if the user does not follow the specified input format.
The standard way to deal with this kind of problem is to use command-line options rather than gathering input from STDIN. Getopt::Long comes with Perl and is serviceable:
use strict; use warnings FATAL => 'all';
use Getopt::Long qw(GetOptions);
my %opt;
GetOptions(\%opt, 'qseq=s', 'barcode=s') or die;
die <<"USAGE" unless exists $opt{qseq} and $opt{qseq} =~ /^sample\d[.]qseq$/ and exists $opt{barcode} and $opt{barcode} =~ /^barcode.*\.txt$/;
Usage: $0 --qseq sample1.qseq --barcode barcode.txt
$0 -q sample1.qseq -b barcode.txt
USAGE
printf "q==<%s> b==<%s>\n", $opt{qseq}, $opt{barcode};
The shell will deal with any extraneous whitespace; try it and see. You still need to validate the file names yourself. The regexes in the example are something I made up for illustration. Use Pod::Usage for a fancier way to show helpful documentation to users who are likely to get the invocation wrong.
There are dozens of more advanced Getopt modules on CPAN.
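If you want to try the validation without typing command lines over and over, here is a minimal sketch that wraps the same checks in a function. It assumes Getopt::Long's GetOptionsFromArray, which parses an array you pass in instead of @ARGV. The helper name parse_args and the sample argument lists are made up for illustration, and the patterns are the same invented ones as above.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: returns (qseq, barcode), or an empty list if the
# arguments are missing or do not match the made-up naming rules.
sub parse_args {
    my @args = @_;
    my %opt;
    GetOptionsFromArray(\@args, \%opt, 'qseq=s', 'barcode=s') or return;
    return unless defined $opt{qseq}    and $opt{qseq}    =~ /^sample\d[.]qseq$/;
    return unless defined $opt{barcode} and $opt{barcode} =~ /^barcode.*[.]txt$/;
    return @opt{qw(qseq barcode)};
}

# Each list is what the shell would hand over: already split into words.
my @examples = (
    [qw(--qseq sample1.qseq --barcode barcode.txt)],
    [qw(--qseq notes.doc --barcode barcode.txt)],
);
for my $argv (@examples) {
    my @files = parse_args(@$argv);
    if (@files) { print "accepted: @files\n" }
    else        { print "rejected: @$argv\n" }
}
```

In a real script you would call parse_args(@ARGV) and print a usage message when it returns nothing.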
You'll need to trim spaces before handling the filename data in your routine. You could check the file extension with yet another regular expression, as nicely described in Is there a regular expression in Perl to find a file's extension?. If it's the actual type of the file that matters to you, then it might be more worthwhile to check for that instead with File::LibMagic.
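As a rough sketch of the extension check, assuming the core File::Basename module is acceptable: fileparse can split off a suffix that matches a pattern. The helper name has_extension is made up for illustration.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: true if $path ends in one of the given extensions
# (compared case-insensitively, without the leading dot).
sub has_extension {
    my ($path, @extensions) = @_;
    # fileparse returns (basename, directory, matched suffix)
    my (undef, undef, $suffix) = fileparse($path, qr/\.[^.]*/);
    return scalar grep { lc $suffix eq lc ".$_" } @extensions;
}

for my $name ('sample1.qseq', 'barcode.txt', 'README') {
    my $verdict = has_extension($name, 'qseq') ? 'is' : 'is not';
    print "$name $verdict a .qseq file\n";
}
```

This only looks at the name, so it says nothing about what the file actually contains.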
First, put use strict; at the top of your code and declare your variables.
Second, this:
# remove the ',' and put the files into an array separated by spaces; indexes the files
push @filename, join(' ', split(',', $filenames))
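is not going to do what you want. split() takes a string and returns a list of pieces; join() takes a list of items and returns a single string. Joining the pieces back together therefore pushes one string onto @filename, not two separate file names. You just want split: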
my @filenames = split(',', $filenames);
That will create an array like you expect.
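A quick way to see what split gives you, using a made-up input string. Note that whitespace around the comma stays attached to the pieces, which is why trimming is still needed:

```perl
use strict;
use warnings;

my $filenames = 'sample1.qseq, barcode.txt';   # example input, assumed
my @filenames = split /,/, $filenames;

# Angle brackets make any leftover whitespace visible.
print "<$_>\n" for @filenames;
# prints:
# <sample1.qseq>
# < barcode.txt>
```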
This function will safely trim white space from the beginning and end of a string:
sub trim {
    my $string = shift;
    $string =~ s/^\s+//;   # strip leading whitespace
    $string =~ s/\s+$//;   # strip trailing whitespace (including a newline)
    return $string;
}
Access it like this:
my $file = trim(shift @filenames);
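Putting split and trim together, a small sketch might look like this. The input string is invented, and the check for exactly two names is my assumption about what your script expects:

```perl
use strict;
use warnings;

sub trim {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

my $input = "  sample1.qseq ,  barcode.txt \n";   # example input, assumed
my @filenames = map { trim($_) } split /,/, $input;

die "Expected exactly two file names\n" unless @filenames == 2;
print "<$_>\n" for @filenames;
# prints:
# <sample1.qseq>
# <barcode.txt>
```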
Depending on your script, it might be easier to pass the strings as command-line arguments. You can access them through the @ARGV array, but I prefer to use Getopt::Long:
use strict;
use Getopt::Long;
Getopt::Long::Configure("bundling");
my ($qseq_filename, $barcode);
GetOptions(
    'q|qseq=s' => \$qseq_filename,
    'b|bar=s'  => \$barcode,
) or die "Usage: $0 -q <qseq file> -b <barcode file>\n";
You can then call this as:
./script.pl -q sample1.qseq -b barcode.txt
And the variables will be properly populated without a need to worry about trimming white space.
While I think your design is a little iffy, the following should work:
my @fileNames = split(',', $filenames);
foreach my $fileName (@fileNames) {
    # reject any name that contains whitespace
    if ($fileName =~ /\s/) {
        print STDERR "Invalid filename.\n";
        exit 1;
    }
}
my ($qseq, $barcode) = @fileNames;
And here is one more way you could do it with regex (if you are reading the input from STDIN):
# read a line from STDIN
my $filenames = <STDIN>;
# parse the line with a regex or die with an error message
my ($qseq_filename, $barcode) = $filenames =~ /^\s*(\S.*?)\s*,\s*(\S.*?)\s*$/
or die "invalid input '$filenames'";
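To get a feel for what that pattern accepts, here is a small sketch that runs it over a few made-up lines. The helper name parse_pair is hypothetical; it just wraps the same regex and returns an empty list when the line does not match.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the regex above.
# Returns (first, second) on success, an empty list on failure.
sub parse_pair {
    my ($line) = @_;
    return $line =~ /^\s*(\S.*?)\s*,\s*(\S.*?)\s*$/;
}

my @lines = (
    "sample1.qseq,barcode.txt\n",
    "  sample1.qseq ,  barcode.txt  \n",
    "only_one.qseq\n",
);
for my $line (@lines) {
    my @pair = parse_pair($line);
    if (@pair) { print "ok: <$pair[0]> <$pair[1]>\n" }
    else       { print "rejected\n" }
}
# prints:
# ok: <sample1.qseq> <barcode.txt>
# ok: <sample1.qseq> <barcode.txt>
# rejected
```

Keep in mind that the pattern only checks for two non-empty parts around the first comma. A name with a space inside it, or a second part that contains another comma, is still accepted.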