Question
I am writing an AWS Glue job and using the DynamicFrame API's resolveChoice with specs. I am trying to cast the source columns once the DynamicFrame has been created from the catalog.
I have successfully implemented the casting via resolveChoice specs, but when casting a date I get null values. I would like to understand how to pass the source date format as part of the cast.
self.df_TR01 = self.df_TR01.resolveChoice(specs=[('col1', 'cast:string'), ('col2_date', 'cast:date')]).toDF()
However, col2_date comes back as null, and I would like to know how to supply the source date format in the statement above.
Answer 1:
I ran into something similar: when I wrote dates to Redshift, they were also landing as nulls. In my case, the following resolved the issue, so it may help here too.
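from datetime import datetime
from awsglue.transforms import Map

def fix_dates(m):
    # strptime needs %-style directives; "%m/%d/%y" matches values such as 6/29/20
    m["col2"] = datetime.strptime(m["col2"], "%m/%d/%y")
    return m

custommapping1 = Map.apply(frame = datasource0, f = fix_dates, transformation_ctx = "custommapping1")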
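As written, fix_dates raises an exception on a missing or malformed value. A minimal sketch of a more defensive variant follows, in plain Python so it can be tried outside Glue. The helper name parse_source_date, the constant SOURCE_FORMAT, and the assumed source layout (month/day/two-digit year, such as 6/29/20) are illustrative assumptions and not part of the Glue API. Whether Glue infers a date type from the returned value is worth checking against your job's output schema.

```python
from datetime import date, datetime
from typing import Optional

# Assumed source layout: month/day/two-digit year, e.g. "6/29/20".
SOURCE_FORMAT = "%m/%d/%y"


def parse_source_date(value: Optional[str], fmt: str = SOURCE_FORMAT) -> Optional[date]:
    """Parse value with fmt; return None when it is missing, blank, or malformed."""
    if value is None or not value.strip():
        return None
    try:
        return datetime.strptime(value.strip(), fmt).date()
    except ValueError:
        # The string does not match the expected layout; keep the row, null the field.
        return None


def fix_dates(record: dict) -> dict:
    """Replace the string in "col2" with a parsed date (or None) and return the record."""
    record["col2"] = parse_source_date(record.get("col2"))
    return record


if __name__ == "__main__":
    print(fix_dates({"col1": "a", "col2": "6/29/20"}))
    print(fix_dates({"col1": "b", "col2": "not a date"}))
```

This version of fix_dates takes and returns a dict-like record, so it could be passed to Map.apply in the same way as the function above.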
Alternatively, you can use Spark SQL, for example:
datasource0.toDF().createOrReplaceTempView("my_temp_view")
df_cols = spark.sql("""
    select to_date(cast(unix_timestamp(col2, 'M/d/yy') as timestamp)) as col2
    from my_temp_view
""")
ResolveChoice is normally able to deal with most ambiguities. Can you share a sample date that fails to cast correctly? I could then try it on my end as well.
Source: https://stackoverflow.com/questions/62636191/dynamic-frame-resolve-choice-specs-date-cast