Find content of one file from another file in UNIX


I have 2 files. The first file contains a list of row IDs of tuples of a table in the database. The second file contains SQL queries with these row IDs in the "where" clause.

8 Answers

北海茫月

2020-12-01 08:10

```
# Report any lines contained in file1 that are missing from file2
IFS=$'\n'
for a in $(cat file1); do
  (( ! $(grep -F -c -- "$a" file2) )) && echo "$a"
done
unset IFS
```

Or, to do what the asker wants, remove the negation and redirect the output:

```
(
  IFS=$'\n'
  for a in $(cat file1); do
    (( $(grep -F -c -- "$a" file2) )) && echo "$a"
  done
) >> file3
```


一整个雨季

2020-12-01 08:18

You don't need regexps, so use `grep -F -f file1 file2`.
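To make the behavior of `grep -F -f` concrete: each line of file1 is treated as a fixed string, and every line of file2 that contains at least one of them as a substring is printed. The Python sketch below mimics that under simplifying assumptions; the function name, the table name `t`, and the sample data are hypothetical, and the query shape is borrowed from the other answers.

```python
def fixed_string_filter(patterns, lines):
    """Return the lines that contain at least one pattern as a plain substring."""
    return [line for line in lines if any(p in line for p in patterns)]


# Hypothetical sample data standing in for file1 (row IDs) and file2 (queries).
ids = ["12", "40"]
queries = [
    "select * from t where ri=12;",
    "select * from t where ri=123;",
    "select * from t where ri=7;",
]

for query in fixed_string_filter(ids, queries):
    print(query)
```

Because the match is a substring match, ID 12 also selects the query for ID 123 here. If that matters for your data, the `-w` option of GNU and BSD grep (match whole words only) is worth trying.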

庸人自扰

2020-12-01 08:19
Most of the previous answers are correct, but the only thing that worked for me was this command:
```
grep -oi -f a.txt b.txt
```
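For readers unsure what the two extra flags change: `-i` ignores case, and `-o` prints only the matched text instead of the whole line, so this command outputs the IDs that were found rather than the queries. A rough Python approximation follows; the function name and sample data are hypothetical, and it is only a sketch of the matching behavior, not of grep's implementation.

```python
import re


def only_matching(patterns, lines):
    """Return each matched substring, ignoring case, in the order found."""
    # Longest alternatives first, so the longest pattern wins at a given position.
    alternatives = sorted((p for p in patterns if p), key=len, reverse=True)
    if not alternatives:
        return []
    regex = re.compile("|".join(re.escape(p) for p in alternatives), re.IGNORECASE)
    return [m.group(0) for line in lines for m in regex.finditer(line)]


# Hypothetical sample data standing in for a.txt (IDs) and b.txt (queries).
ids = ["a1", "b2"]
queries = [
    "select * from t where ri=A1;",
    "select * from t where ri=c3;",
    "select * from t where ri=b2;",
]

print(only_matching(ids, queries))
```

If you need the full queries rather than the matched IDs, drop `-o`.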

情话喂你

2020-12-01 08:20
One way with awk:
```
awk -v FS="[ =]" 'NR==FNR{rows[$1]++;next}(substr($NF,1,length($NF)-1) in rows)' File1 File2
```
This should be pretty quick. On my machine, it took under 2 seconds to create a lookup of 1 million entries and compare it against 3 million lines.

Machine Specs:
```
Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz (8 cores)
98 GB RAM
```
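For anyone less familiar with awk, the one-liner can be read as two passes: the first stores the first field of every File1 line in a lookup table, the second splits each File2 line on spaces and `=`, drops the final character of the last field (assumed to be the trailing `;`), and prints the line if what remains is in the table. A Python sketch of the same logic, with hypothetical function names and sample data:

```python
import re


def load_ids(id_lines):
    """Build the lookup set from the ID lines (the NR==FNR pass in the awk command)."""
    ids = set()
    for line in id_lines:
        fields = re.split(r"[ =]", line.rstrip("\n"))
        ids.add(fields[0])
    return ids


def matching_queries(ids, query_lines):
    """Yield queries whose last field, minus its final character, is a known ID."""
    for line in query_lines:
        line = line.rstrip("\n")
        last_field = re.split(r"[ =]", line)[-1]
        if last_field[:-1] in ids:
            yield line


# Hypothetical sample data standing in for File1 and File2.
id_lines = ["12\n", "40\n"]
query_lines = [
    "select * from t where ri=12;\n",
    "select * from t where ri=123;\n",
    "select * from t where ri=40;\n",
]

for query in matching_queries(load_ids(id_lines), query_lines):
    print(query)
```

Unlike a substring search, this compares the whole ID, so the query for 123 is not selected by ID 12. It does rely on every query ending in `=<id>;`.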

长发绾君心

2020-12-01 08:20
I may be missing something, but wouldn't it be sufficient to just iterate the IDs in file1 and for each ID, grep file2 and store the matches in a third file? I.e.
```
for ID in $(cat file1); do grep -- "$ID" file2; done > file3
```
This is not terribly efficient (since file2 is read over and over again), but it may be good enough for you. If you want more speed, I'd suggest using a more powerful scripting language that lets you read file2 into a map, which makes it quick to find the line for a given ID.

Here's a Python version of this idea:
```
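queryByID = {}

# Index each query in file2 by the ID between the last '=' and the ';' after it
with open('file2') as f:
    for line in f:
        lastEquals = line.rfind('=')
        semicolon = line.find(';', lastEquals)
        id = line[lastEquals + 1:semicolon]
        queryByID[id] = line.rstrip()

# Print the stored query for every ID listed in file1
with open('file1') as f:
    for line in f:
        id = line.rstrip()
        if id in queryByID:
            print(queryByID[id])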

爱一瞬间的悲伤

2020-12-01 08:23

I suggest using a programming language such as Perl, Ruby or Python.

In Ruby, a solution reading both files (f1 and f2) just once could be:

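```
require 'set'

# Load the row IDs from f1 into a set for fast lookup
idxes = File.readlines('f1').map(&:chomp).to_set

File.foreach('f2') do |line|
  # Only consider queries that end in "where ri=<digits>;"
  next unless line =~ /where ri=(\d+);$/
  puts line if idxes.include? $1
end
```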

Or with Perl:

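```
use strict;
use warnings;

# Load the row IDs from f1 into a hash
my %idxs;
open my $file, '<', 'f1' or die "f1: $!";
while (<$file>) { chomp; $idxs{$_} = 1; }
close $file;

# Print each query in f2 whose ID appears in the hash
open $file, '<', 'f2' or die "f2: $!";
while (<$file>) {
    next unless $_ =~ /where ri=(\d+);$/;
    print $_ if $idxs{$1};
}
close $file;
```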
