Question
I have two dataframes with different numbers of rows.
X and Y are the coordinates of a position in 2D.
DF1:
X,Y,C
1,1,12
2,2,22
3,3,33
4,4,45
5,5,43
6,6,56
DF2:
X,Y is the start of the square; the next two columns, X1,Y1, are the end of the square.
X,Y,X1,Y1
1,1,3,3
2,2,4,4
Part of my code:
A = (abs(DF1['X']).values > abs(DF2['X']).values)
B = (abs(DF1['Y']).values > abs(DF2['Y']).values)
C = (abs(DF1['X']).values < abs(DF2['X1']).values)
D = (abs(DF1['Y']).values < abs(DF2['Y1']).values)
RESULT = A & B & C & D
result=DF1[RESULT]
ALSO: I could use only 2 columns from DF2, in which case RESULT would use only A & B; this is only an example. Right now the two X,Y pairs give me the range of values.
When DF2 has only one row, this works. But with more than one row I get:
ValueError: operands could not be broadcast together with shapes
I know that I need a rule so that all rows are compared, but I don't know how to write it. I tried diff, but without good results.
OUTPUT: I need to get rid of this error and process DF2 row by row. For each row in DF2 I need a separate result. For row 1:
X,Y,C
2,2,22
For row 2:
X,Y,C
3,3,33
After checking each row, I need to save the resulting dataframes to one file. So in this example the file will contain:
2,2,22
3,3,33
Thanks for any advice.
EDIT: for Tbaki
def isInSquare(row, df2):
    df2=result_from_other_def1.df1
    df1=result_from_other_def2.df2
    if (row.X > df2.iloc[0].X) and (row.Y > df2.iloc[1].Y):
        if (row.X < df2.iloc[0].X1) and (row.Y < df2.iloc[1].Y2):
            if (row.X < df2.iloc[1].X) and (row.Y < df2.iloc[1].Y):
                if (row.X > df2.iloc[0].X) and (row.Y > df2.iloc[1].Y2):
                    return True
    return False
DF1.apply(lambda x: isInSquare(x,DF2),axis= 1)  # if I leave this line here, tkinter will run it automatically, so in my opinion it should be inside the definition.
Also, I don't know how many rows there will be in DF1 and in DF2.
Thanks
Answer 1:
Check this code, which checks for a 5x5 square.
import pandas as pd

DF1 = pd.DataFrame({"X":[1,2,3,4,5,6],"Y":[1,2,3,4,5,6],"C":[12,22,33,45,13,56]})
DF2 = pd.DataFrame({"X":[1,5],"Y":[1,1],"X1":[5,1],"Y1":[5,5]})

def isInSquare(row, df2):
    # The two rows of df2 are read as opposite edges of a single square.
    c1 = (row.X > df2.iloc[0].X) and (row.Y > df2.iloc[0].Y)
    c1 = c1 and (row.X < df2.iloc[0].X1) and (row.Y < df2.iloc[0].Y1)
    c1 = c1 and (row.X < df2.iloc[1].X) and (row.Y > df2.iloc[1].Y)
    c1 = c1 and (row.X > df2.iloc[1].X1) and (row.Y < df2.iloc[1].Y1)
    return c1

# Keep only the DF1 rows for which the check is True.
DF_NEW = DF1[DF1.apply(lambda x: isInSquare(x,DF2),axis= 1)]
Output:
C X Y
1 22 2 2
2 33 3 3
3 45 4 4
If you want to keep the max C:
DF_NEW = DF_NEW.groupby(["X","Y"]).max().reset_index()
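If instead each row of DF2 should be treated as its own rectangle, as the expected output in the question suggests, one possible approach is to build a separate mask per DF2 row and concatenate the results. This is only a sketch under a few assumptions: the helper name points_in_rectangle and the file name result.csv are made up for illustration, the comparison is strict (points on the border are excluded), the abs() calls from the question are left out because the sample coordinates are all positive, and DF2 has at least one row (pd.concat raises on an empty list).

```python
import pandas as pd

# Sample data copied from the question.
DF1 = pd.DataFrame({"X": [1, 2, 3, 4, 5, 6],
                    "Y": [1, 2, 3, 4, 5, 6],
                    "C": [12, 22, 33, 45, 43, 56]})
DF2 = pd.DataFrame({"X": [1, 2], "Y": [1, 2], "X1": [3, 4], "Y1": [3, 4]})


def points_in_rectangle(points, rect):
    """Return the rows of `points` strictly inside one rectangle.

    `rect` is a single DF2 row with fields X, Y (start) and X1, Y1 (end).
    Hypothetical helper, not part of pandas.
    """
    mask = (
        (points["X"] > rect.X) & (points["X"] < rect.X1)
        & (points["Y"] > rect.Y) & (points["Y"] < rect.Y1)
    )
    return points[mask]


# One result frame per DF2 row; itertuples yields each row as a namedtuple,
# so every comparison is "whole DF1 column vs. one scalar" and no
# broadcasting between columns of different lengths is attempted.
per_rectangle = [points_in_rectangle(DF1, rect)
                 for rect in DF2.itertuples(index=False)]

# Stack the per-row results into one frame.
combined = pd.concat(per_rectangle, ignore_index=True)

# Render as CSV text; combined.to_csv("result.csv", index=False) would
# write the same content to a file (the file name is an assumption).
csv_text = combined.to_csv(index=False)
print(csv_text)
```

Because each DF2 row is compared against the full DF1 columns one at a time, the number of rows in DF1 and DF2 no longer has to match. A point that falls inside several rectangles appears once per rectangle in combined.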
Source: https://stackoverflow.com/questions/44456526/comparing-values-from-different-dataframes-line-by-line-python