Question
I can send my data to Spark Streaming through a CSV file: first I write my random numbers to a CSV file, then I send it. Is it possible to send the numbers directly instead? My socket code:
import socket
host = 'localhost'
port = 8080
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host, port))
s.listen(1)
while True:
    print('\nListening for a client at',host , port)
    conn, addr = s.accept()
    print('\nConnected by', addr)
    try:
        print('\nReading file...\n')
        while 1:
            out = "test01"
            print('Sending line', line)
            conn.send(out)
    except socket.error:
        print ('Error Occured.\n\nClient disconnected.\n')
    conn.close()
Spark Streaming code:
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
sc = SparkContext("local[2]","deneme")
ssc = StreamingContext(sc, 10)
socket_stream = ssc.socketTextStream("localhost",8080)
random_integers = socket_stream.window( 30 )
digits = random_integers.flatMap(lambda line: line.split(" ")).map(lambda digit: (digit, 1))
digit_count = digits.reduceByKey(lambda x,y:x+y)
digit_count.pprint()
ssc.start()
Answer 1:
This happens because the socket blocks while sending the data and never moves on. The most basic solution is to send a fixed amount of data and then close the connection:
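import socket
import time
host = 'localhost'
# The Spark side must connect to this same port,
# e.g. ssc.socketTextStream("localhost", 50007)
port = 50007
i = 0
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((host, port))
s.listen(1)
try:
    while True:
        # Wait for a client (the Spark receiver) to connect
        conn, addr = s.accept()
        try:
            # Send ten newline-terminated integers, one per second
            for j in range(10):
                conn.send(bytes("{}\n".format(i), "utf-8"))
                i += 1
                time.sleep(1)
            # Close the connection so the server can accept the next client
            conn.close()
        except socket.error: pass
finally:
    s.close()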
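The same idea can be applied to the random numbers from the question without going through a CSV file. The following is only a minimal sketch: the helper names `encode_record` and `send_random_digits` are made up for illustration, and it assumes each record is a line of space-separated digits ending in a newline, which matches the `line.split(" ")` call in the Spark code above.

```python
import random
import socket


def encode_record(values):
    """Join values with spaces and terminate the record with a newline."""
    return (" ".join(str(v) for v in values) + "\n").encode("utf-8")


def send_random_digits(conn, n_records=10, per_record=5):
    """Send n_records lines of random digits (0-9) over a connected socket.

    Hypothetical helper: `conn` is the socket returned by s.accept().
    """
    for _ in range(n_records):
        digits = [random.randint(0, 9) for _ in range(per_record)]
        # sendall keeps writing until the whole record is handed to the OS
        conn.sendall(encode_record(digits))
```

In the server loop above, `send_random_digits(conn)` would take the place of the `for j in range(10)` block, followed by `conn.close()` as before.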
For something more flexible, look into non-blocking sockets with timeouts.
Source: https://stackoverflow.com/questions/47726552/python-send-integer-or-string-to-spark-streaming
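One way to read that suggestion is to use Python's `socket.settimeout`, so that neither `accept` nor `sendall` can block forever. This is a rough sketch under that assumption, not the answer author's code: `serve_with_timeout` and its `should_stop` callback are hypothetical names introduced here.

```python
import socket


def serve_with_timeout(server, records, accept_timeout=1.0,
                       send_timeout=1.0, should_stop=lambda: False):
    """Send `records` to each client that connects, one per line.

    `server` is a bound, listening socket. The loop wakes up every
    `accept_timeout` seconds to check `should_stop()`.
    Returns the number of clients handled.
    """
    server.settimeout(accept_timeout)
    served = 0
    while not should_stop():
        try:
            conn, addr = server.accept()
        except socket.timeout:
            continue  # nobody connected in time; re-check should_stop()
        with conn:
            conn.settimeout(send_timeout)
            try:
                for record in records:
                    conn.sendall("{}\n".format(record).encode("utf-8"))
            except OSError:
                pass  # slow or disconnected client: drop it, keep serving
        served += 1
    return served
```

Because `accept` now returns control periodically, the server can be shut down cleanly or do other work between clients, instead of sitting inside a blocking call.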