Question
How can I write the following code in Java? If I have a list of records (dicts) in Java, how can I write Beam code that writes them to TFRecord files as serialized tf.train.Example protos? There are plenty of examples of doing this in Python; one is shown below. How can I write the same logic in Java?
import tensorflow as tf
import apache_beam as beam
from apache_beam.runners.interactive import interactive_runner
from apache_beam.coders import ProtoCoder


class Foo(beam.DoFn):
    def process(self, element, *args, **kwargs):
        # Imported inside the DoFn so the module is available on the workers.
        import tensorflow as tf
        foo = element.get('foo')
        bar = element.get('bar')
        feature = {
            "foo":
                tf.train.Feature(bytes_list=tf.train.BytesList(value=[foo.encode('utf-8')])),
            "bar":
                tf.train.Feature(bytes_list=tf.train.BytesList(value=[bar.encode('utf-8')]))
        }
        example_proto = tf.train.Example(features=tf.train.Features(feature=feature))
        yield example_proto


p = beam.Pipeline(runner=interactive_runner.InteractiveRunner())
records = p | "Create records" >> beam.Create([{'foo': 'abc', 'bar': 'pqr'} for _ in range(10)])
tf_examples = records | "Convert to tf examples" >> beam.ParDo(Foo())
tf_examples | "Dump Records" >> beam.io.WriteToTFRecord(
    file_path_prefix="./output/data-",
    # ProtoCoder expects the message class, not an instance.
    coder=ProtoCoder(tf.train.Example),
    file_name_suffix='.tfrecord',
    num_shards=2)
p.run()
Answer 1:
I have attempted this in Java but am still running into some issues. The new question is here: "Write tfrecords from beam pipeline?".
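For reference, here is a minimal sketch of how the Python pipeline in the question might be ported to Java. It is not taken from the linked question. It relies on the Beam Java SDK's TFRecordIO.write(), which consumes a PCollection<byte[]>, so each tf.train.Example is serialized with toByteArray() before being written. It assumes the Beam Java SDK, a runner, protobuf-java, and the TensorFlow proto classes are on the classpath. The org.tensorflow.example package is an assumption based on the org.tensorflow:proto artifact and may differ in other TensorFlow Java releases. The class name TfRecordWriteSketch and its helper methods are hypothetical names used only for illustration.

```java
import com.google.protobuf.ByteString;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.coders.MapCoder;
import org.apache.beam.sdk.coders.StringUtf8Coder;
import org.apache.beam.sdk.io.TFRecordIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptor;
// Assumption: proto classes from the org.tensorflow:proto artifact.
import org.tensorflow.example.BytesList;
import org.tensorflow.example.Example;
import org.tensorflow.example.Feature;
import org.tensorflow.example.Features;

class TfRecordWriteSketch {

  // Wraps a string as a bytes_list feature, mirroring foo.encode('utf-8') in Python.
  static Feature bytesFeature(String value) {
    return Feature.newBuilder()
        .setBytesList(BytesList.newBuilder().addValue(ByteString.copyFromUtf8(value)))
        .build();
  }

  // Converts one record into a serialized tf.train.Example.
  static byte[] toSerializedExample(Map<String, String> record) {
    Features features = Features.newBuilder()
        .putFeature("foo", bytesFeature(record.get("foo")))
        .putFeature("bar", bytesFeature(record.get("bar")))
        .build();
    return Example.newBuilder().setFeatures(features).build().toByteArray();
  }

  public static void main(String[] args) {
    Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());

    // Same toy input as the Python version: ten identical records.
    Map<String, String> record = new HashMap<>();
    record.put("foo", "abc");
    record.put("bar", "pqr");

    p.apply("Create records",
            Create.of(Collections.nCopies(10, record))
                .withCoder(MapCoder.of(StringUtf8Coder.of(), StringUtf8Coder.of())))
        .apply("Convert to tf examples",
            MapElements.into(TypeDescriptor.of(byte[].class))
                .via((Map<String, String> r) -> toSerializedExample(r)))
        // TFRecordIO handles the TFRecord framing; it only needs the raw bytes.
        .apply("Dump Records",
            TFRecordIO.write()
                .to("./output/data-")
                .withSuffix(".tfrecord")
                .withNumShards(2));

    p.run().waitUntilFinish();
  }
}
```

Unlike the Python version, no ProtoCoder is passed to the sink here: the elements are already byte arrays by the time they reach TFRecordIO.write(). With no runner specified in the arguments, Beam falls back to the direct runner, provided it is on the classpath.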
Source: https://stackoverflow.com/questions/61247661/writing-tfrecords-in-apche-beam-with-java