How do I read a random line from one file?

灰色年华

Is there a built-in method to do it? If not, how can I do this without too much overhead?

11 answers

失恋的感觉

2020-12-04 20:06
```
import random

# Read every line into memory, then pick one uniformly at random.
with open('file.txt') as f:
    lines = f.read().splitlines()
myline = random.choice(lines)
print(myline)
```
For a very long file: seek to a random position in the file based on its length and find the two newline characters after that position (or a newline and the end of the file); the text between them is a complete line. If the position turned out to be inside the last line, try again 100 characters earlier, or from the beginning of the file if the original seek position was less than 100.

However, this is overcomplicated, as a file is an iterator. So turn it into a list and use random.choice (if you need several lines, use random.sample):
```
import random
print(random.choice(list(open('file.txt'))))
```
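For the case where several lines are needed, a minimal sketch using `random.sample` might look like the following. The helper name `sample_lines` and its parameter `k` are illustrative assumptions, not part of the original answer, and the whole file is still read into memory.

```python
import random


def sample_lines(path, k):
    """Return up to k lines taken from distinct positions (hypothetical helper)."""
    with open(path) as f:
        lines = f.read().splitlines()
    # random.sample raises ValueError if k exceeds the population size,
    # so cap k at the number of available lines.
    return random.sample(lines, min(k, len(lines)))
```

Unlike calling `random.choice` repeatedly, this never returns the same line position twice.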

轻奢々

2020-12-04 20:07

If you don't want to load the whole file into RAM with f.read() or f.readlines(), you can get a random line this way:

```
import os
import random


def get_random_line(filepath: str) -> str:
    file_size = os.path.getsize(filepath)
    with open(filepath, 'rb') as f:
        while True:
            pos = random.randint(0, file_size)
            # seek first, so that a retry never leaves the handle at EOF
            f.seek(pos)  # seek to random position
            if not pos:  # the first line is chosen
                return f.readline().decode()  # return str
            f.readline()  # skip possibly incomplete line
            line = f.readline()  # read next (full) line
            if line:
                return line.decode()
            # else: line is empty -> EOF -> try another position in next iteration
```
            # else: line is empty -> EOF -> try another position in next iteration

P.S.: Yes, this was proposed by Ignacio Vazquez-Abrams in his answer above, but a) there's no code in his answer and b) I came up with this implementation myself; it can return the first or the last line. I hope it is useful to someone.

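However, if you care about the distribution, this code is not an option for you: the choice is not uniform, because a line is picked with a probability roughly proportional to the length of the line that precedes it.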
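To see how skewed the distribution can get, here is a rough experiment, offered as a sketch rather than a benchmark. It restates the seek-and-skip function and counts the results over many draws on a temporary file with one long line followed by two short ones. The helpers `line_frequencies` and `write_demo_file`, the file contents, and the trial count are all made up for illustration.

```python
import os
import random
import tempfile
from collections import Counter


def get_random_line(filepath):
    """Seek-and-skip technique: jump to a random byte, return the next full line."""
    file_size = os.path.getsize(filepath)
    with open(filepath, 'rb') as f:
        while True:
            pos = random.randint(0, file_size)
            f.seek(pos)           # seek first so a retry never stays at EOF
            if not pos:           # position 0: return the first line
                return f.readline().decode()
            f.readline()          # skip the (possibly partial) current line
            line = f.readline()   # the line after it
            if line:
                return line.decode()


def line_frequencies(filepath, trials):
    """Count how often each line is returned over `trials` draws."""
    return Counter(get_random_line(filepath) for _ in range(trials))


def write_demo_file():
    """Create a temp file: one 1000-character line, then two short lines."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as f:
        f.write(b'a' * 1000 + b'\n' + b'b\n' + b'c\n')
    return path


if __name__ == '__main__':
    path = write_demo_file()
    try:
        for line, count in line_frequencies(path, 2000).most_common():
            print(repr(line[:10]), count)
    finally:
        os.remove(path)
```

In this setup almost every draw lands inside the long first line, so the line right after it (`'b\n'`) is returned far more often than the other two.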


天命终不由人

2020-12-04 20:10
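You can add the lines to a set(), which iterates over them in an arbitrary order. Note that this order is not truly random: it depends on string hashing (which varies between interpreter runs unless PYTHONHASHSEED is fixed), it stays the same within one run, and duplicate lines are collapsed into one.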
```
# A set drops duplicate lines and iterates in arbitrary (hash-based) order.
with open("lines.txt", 'r') as infile:
    f = set(infile.readlines())
```
To get the first line in the set's iteration order (not the first line of the file):
```
print(next(iter(f)))
```
To get the third line in the set's iteration order:
```
print(list(f)[2])
```
To list all the lines in the set:
```
for line in f:
    print(line)
```
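If a genuinely random order is the goal, `random.shuffle` is a different technique from the `set()` approach above and may be a better fit. A minimal sketch, in which `shuffled_lines` is a hypothetical helper name:

```python
import random


def shuffled_lines(path):
    """Return all lines of the file in a random order (hypothetical helper)."""
    with open(path) as f:
        lines = f.read().splitlines()
    random.shuffle(lines)  # shuffles the list in place
    return lines
```

Unlike a set, this keeps duplicate lines and gives a fresh order on every call.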
暖寄归人

2020-12-04 20:11
Although I am four years late, I think I have the fastest solution. Recently I wrote a Python package called linereader, which allows you to manipulate the pointers of file handles.

Here is the simple solution to getting a random line with this package:
```
from random import randint
from linereader import dopen

length = ...    # placeholder: the number of lines in the file
filename = ...  # placeholder: the path to the file

file = dopen(filename)
random_line = file.getline(randint(1, length))
```
The first call is the slowest, as linereader has to compile the output file into a special format. After this is done, linereader can access any line of the file quickly, whatever the size of the file.

If your file is very small (small enough to fit into an MB), then you can replace dopen with copen, which makes a cached entry of the file in memory. Not only is this faster, but you also get the number of lines in the file as it is loaded into memory; it is done for you. All you need to do is generate the random line number. Here is some example code for this.
```
from random import randint
from linereader import copen

filename = ...  # placeholder: the path to the file
file = copen(filename)
lines = file.count('\n')
random_line = file.getline(randint(1, lines))
```
I just got really happy because I saw someone who could benefit from my package! Sorry for the dead answer, but the package could definitely be applied to many other problems.
梦毁少年i

2020-12-04 20:14
A slightly improved version of Alex Martelli's answer, which handles empty files (by returning a default value):
```
from random import randrange

def random_line(afile, default=None):
    line = default
    for i, aline in enumerate(afile, start=1):
        if randrange(i) == 0:  # random int [0..i)
            line = aline
    return line
```
This approach can be used to get a random item from any iterator using O(n) time and O(1) space.
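The reason the pick is uniform: item i replaces the current choice with probability 1/i, and each earlier item then survives with probability (i-1)/i, so after n items every one of them is held with probability 1/n. The sketch below repeats the function and applies it to a generator and to an empty iterator. The sample data and the `"<empty>"` default are illustrative only.

```python
from random import randrange


def random_line(afile, default=None):
    line = default
    for i, aline in enumerate(afile, start=1):
        if randrange(i) == 0:  # true with probability 1/i
            line = aline
    return line


# Any iterable works, not only file objects.
squares = (n * n for n in range(1, 6))  # a generator: 1, 4, 9, 16, 25
pick = random_line(squares)

# An empty iterator yields the default instead of raising.
fallback = random_line(iter([]), default="<empty>")
```

With a file, the call would be `random_line(f)` inside a `with open(...) as f:` block, so only one line is held in memory at a time.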

醉话见心

2020-12-04 20:20

```
import random

with open("file.txt", "r") as f:
    lines = f.readlines()

# Strip the trailing newline so print() does not emit an extra blank line.
print(random.choice(lines).rstrip("\n"))
```
