I have two files. The first file contains a list of row IDs of tuples of a table in the database, and the second file contains SQL queries with these row IDs in the "where" clause.
The awk/grep solutions mentioned above were slow or memory-hungry on my machine (file1: 10^6 rows, file2: 10^7 rows), so I came up with an SQL solution using sqlite3.
Turn file2 into a CSV-formatted file where the first field is the value after ri=
gawk -F= '{ print $3","$0 }' file2.txt | sed 's/;,/,/' > file2_with_ids.txt
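As an aside, the same transformation could be done in a single gawk pass, without the sed step; a sketch, assuming every statement ends with ri=<id>; as in the samples below:
gawk -F= '{ id = $NF; sub(/;$/, "", id); print id "," $0 }' file2.txt > file2_with_ids.txt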
Create two tables:
sqlite> CREATE TABLE file1(rowId char(10));
sqlite> CREATE TABLE file2(rowId char(10), statement varchar(200));
Import the row IDs from file1:
sqlite> .import file1.txt file1
Import the statements from file2, using the prepared file (file2_with_ids.txt):
sqlite> .separator ,
sqlite> .import file2_with_ids.txt file2
Select all and only the statements in table file2 with a matching rowId in table file1:
sqlite> SELECT statement FROM file2 WHERE file2.rowId IN (SELECT file1.rowId FROM file1);
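Equivalently, the lookup could be written as a join; a sketch, assuming the rowIds in file1 are unique (otherwise the join would emit duplicate statements):
sqlite> SELECT file2.statement FROM file2 JOIN file1 ON file1.rowId = file2.rowId;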
File 3 can be easily created by redirecting output to a file before issuing the select statement:
sqlite> .output file3.txt
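For completeness, the full sequence might look like this, switching output back to the terminal afterwards:
sqlite> .output file3.txt
sqlite> SELECT statement FROM file2 WHERE file2.rowId IN (SELECT file1.rowId FROM file1);
sqlite> .output stdout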
Test data:
sqlite> select count(*) from file1;
1000000
sqlite> select count(*) from file2;
10000000
sqlite> select * from file1 limit 4;
1610666927
1610661782
1610659837
1610664855
sqlite> select * from file2 limit 4;
1610665680|update TABLE_X set ATTRIBUTE_A=87 where ri=1610665680;
1610661907|update TABLE_X set ATTRIBUTE_A=87 where ri=1610661907;
1610659801|update TABLE_X set ATTRIBUTE_A=87 where ri=1610659801;
1610670610|update TABLE_X set ATTRIBUTE_A=87 where ri=1610670610;
Without creating any indices, the select statement took about 15 seconds on an AMD A8 1.8 GHz 64-bit Ubuntu 12.04 machine.
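If that is still too slow, an index on file1's rowId column should let SQLite probe file1 instead of scanning it for every row of file2; a sketch (the index name is arbitrary):
sqlite> CREATE INDEX file1_rowId_idx ON file1(rowId);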
Maybe try AWK and use the numbers from file 1 as keys, for example with this simple two-step script.
The first script, script1.awk, turns each row ID in file1 into a pattern-action rule and writes the result to script2.awk:
{ print "$0 ~ /" $0 "/ { print $0 }" > "script2.awk" }
Generate script2.awk by running it on file1:
awk -f script1.awk file1.txt
and then invoke script2.awk on file2 to extract the matching statements.
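A sketch of that final step, assuming the filenames used above and writing the matches to file3.txt:
awk -f script2.awk file2.txt > file3.txt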