I am having trouble SUMming a bag of values because of a data type error.
When I load a csv file whose lines look like this:
6 574 false 10.1.72.23
Have you tried casting the data returned by the UDF? Applying the schema here does not perform any casting.
e.g.
logs_base = FOREACH raw_logs GENERATE FLATTEN( (tuple(LONG,LONG,CHARARRAY,....)) EXTRACT(line, '^...') ) AS (account_id: INT, ...);