Question
I am working through the example from this web page, and here is what I have so far:
>>> jobs_train, jobs_test = jobs_df.randomSplit([0.6, 0.4])
>>> zuckerberg_train, zuckerberg_test = zuckerberg_df.randomSplit([0.6, 0.4])
>>> train_df = jobs_train.unionAll(zuckerberg_train)
>>> test_df = jobs_test.unionAll(zuckerberg_test)
>>> from pyspark.ml.classification import LogisticRegression
>>> from pyspark.ml import Pipeline
>>> from sparkdl import DeepImageFeaturizer
>>> featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features", modelName="InceptionV3")
>>> lr = LogisticRegression(maxIter=20, regParam=0.05, elasticNetParam=0.3, labelCol="label")
>>> p = Pipeline(stages=[featurizer, lr])
>>> p_model = p.fit(train_df)
and this output appeared:
2018-06-08 20:57:18.985543: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
INFO:tensorflow:Froze 376 variables.
Converted 376 variables to const ops.
Using TensorFlow backend.
Using TensorFlow backend.
INFO:tensorflow:Froze 0 variables.
Converted 0 variables to const ops.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/spark/python/pyspark/ml/base.py", line 64, in fit
    return self._fit(dataset)
  File "/opt/spark/python/pyspark/ml/pipeline.py", line 106, in _fit
    dataset = stage.transform(dataset)
  File "/opt/spark/python/pyspark/ml/base.py", line 105, in transform
    return self._transform(dataset)
  File "/tmp/spark-74707b69-e8c9-498b-b0f2-b38828e5ad21/userFiles-ca1eb7cf-9785-441d-a098-54b62380bcee/databricks_spark-deep-learning-0.1.0-spark2.1-s_2.11.jar/sparkdl/transformers/named_image.py", line 159, in _transform
  File "/opt/spark/python/pyspark/ml/base.py", line 105, in transform
    return self._transform(dataset)
  File "/tmp/spark-74707b69-e8c9-498b-b0f2-b38828e5ad21/userFiles-ca1eb7cf-9785-441d-a098-54b62380bcee/databricks_spark-deep-learning-0.1.0-spark2.1-s_2.11.jar/sparkdl/transformers/named_image.py", line 222, in _transform
  File "/opt/spark/python/pyspark/ml/base.py", line 105, in transform
    return self._transform(dataset)
  File "/tmp/spark-74707b69-e8c9-498b-b0f2-b38828e5ad21/userFiles-ca1eb7cf-9785-441d-a098-54b62380bcee/databricks_spark-deep-learning-0.1.0-spark2.1-s_2.11.jar/sparkdl/transformers/tf_image.py", line 142, in _transform
  File "/tmp/spark-74707b69-e8c9-498b-b0f2-b38828e5ad21/userFiles-ca1eb7cf-9785-441d-a098-54b62380bcee/databricks_tensorframes-0.2.8-s_2.11.jar/tensorframes/core.py", line 211, in map_rows
  File "/tmp/spark-74707b69-e8c9-498b-b0f2-b38828e5ad21/userFiles-ca1eb7cf-9785-441d-a098-54b62380bcee/databricks_tensorframes-0.2.8-s_2.11.jar/tensorframes/core.py", line 132, in _map
  File "/tmp/spark-74707b69-e8c9-498b-b0f2-b38828e5ad21/userFiles-ca1eb7cf-9785-441d-a098-54b62380bcee/databricks_tensorframes-0.2.8-s_2.11.jar/tensorframes/core.py", line 66, in _add_shapes
  File "/tmp/spark-74707b69-e8c9-498b-b0f2-b38828e5ad21/userFiles-ca1eb7cf-9785-441d-a098-54b62380bcee/databricks_tensorframes-0.2.8-s_2.11.jar/tensorframes/core.py", line 35, in _get_shape
  File "/home/sulistyo/tensorflow/lib/python3.6/site-packages/tensorflow/python/framework/tensor_shape.py", line 900, in as_list
    raise ValueError("as_list() is not defined on an unknown TensorShape.")
ValueError: as_list() is not defined on an unknown TensorShape.
Please help, thanks.
Answer 1:
Use the following to read and label the images; the resulting DataFrames can then be split into training and testing sets as in the question:
from pyspark.sql.functions import lit
from sparkdl.image import imageIO
img_dir = "/PATH/TO/personalities/"
# Read each folder with the PIL decoder and attach a constant label column (1 = jobs, 0 = zuckerberg)
jobs_df = imageIO.readImagesWithCustomFn(img_dir + "/jobs", decode_f=imageIO.PIL_decode).withColumn("label", lit(1))
zuckerberg_df = imageIO.readImagesWithCustomFn(img_dir + "/zuckerberg", decode_f=imageIO.PIL_decode).withColumn("label", lit(0))
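Before re-running the pipeline, it may be worth confirming that the two label folders read above actually contain image files, since an empty or misnamed folder would give an empty DataFrame. The following is only a minimal sketch using the Python standard library: `count_images` and `IMAGE_EXTENSIONS` are hypothetical names introduced here for illustration, not part of sparkdl, and the extension list is an assumption about how the images are stored on a local filesystem.

```python
import os

# Hypothetical helper, not part of sparkdl or PySpark.
# Assumption: images are local files with one of these extensions.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def count_images(img_dir, labels=("jobs", "zuckerberg")):
    """Return {label: number of image files} for each sub-folder of img_dir."""
    counts = {}
    for label in labels:
        folder = os.path.join(img_dir, label)
        if not os.path.isdir(folder):
            # A missing folder is reported as zero images rather than raising.
            counts[label] = 0
            continue
        counts[label] = sum(
            1
            for name in os.listdir(folder)
            if name.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts
```

Calling `count_images(img_dir)` with the same `img_dir` as above returns a dictionary with one count per label; a count of zero for either label suggests the path or folder name needs checking before the images are read into Spark.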
Source: https://stackoverflow.com/questions/50752787/valueerror-as-list-is-not-defined-on-an-unknown-tensorshape