Question
In Power BI Desktop, I have a table that is calculated from other tables, with a structure like this:
Input table:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th>Firstname</th>
<th>Email</th>
</tr>
</thead>
<tbody>
<tr>
<td>Scott</td>
<td>ABC@XYZ.com</td>
</tr>
<tr>
<td>Bob</td>
<td>ABC@XYZ.com</td>
</tr>
<tr>
<td>Ted</td>
<td>ABC@XYZ.com</td>
</tr>
<tr>
<td>Scott</td>
<td>EDF@XYZ.com</td>
</tr>
<tr>
<td>Scott</td>
<td>LMN@QRS.com</td>
</tr>
<tr>
<td>Bill</td>
<td>LMN@QRS.com</td>
</tr>
</tbody>
</table>
Now, I want to keep only the first record for each unique email. My expected output table using DAX is:
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th>Firstname</th>
<th>Email</th>
</tr>
</thead>
<tbody>
<tr>
<td>Scott</td>
<td>ABC@XYZ.com</td>
</tr>
<tr>
<td>Scott</td>
<td>EDF@XYZ.com</td>
</tr>
<tr>
<td>Scott</td>
<td>LMN@QRS.com</td>
</tr>
</tbody>
</table>
I tried using RANKX and FILTER, but without success.
Answer 1:
Sadly, the answer to this question is that there is no way in DAX to refer to a row's position relative to the other rows in the table. The only option is to sort by the value of some column.
What we can do with the existing two-column table is get the MAX or MIN Firstname for each Email. We can write a calculated table as follows, where T is the input table and T Unique is the generated table.
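T Unique =
ADDCOLUMNS(
    ALL( T[Email] ),  // one row per distinct Email
    "Firstname",
        // CALCULATE turns the current Email row into a filter (context transition)
        CALCULATE(
            MAX( T[Firstname] )
        )
)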
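But this doesn't satisfy the requirement: MAX returns the alphabetically last Firstname for each Email (Ted for ABC@XYZ.com), not the Firstname from the first row (Scott).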
To obtain the desired result we need to add a column to the input table, with an index or a timestamp.
For this example I added an Index column using the following M code in Power Query. It is generated automatically by referencing the original table and then clicking the Add column -> Index column button:
let
    Source = T,
    #"Added Index" = Table.AddIndexColumn(Source, "Index", 1, 1, Int64.Type)
in
    #"Added Index"
This gave me the T Index table.
Now we can write the following calculated table, which uses the new column to retrieve the first row for each Email:
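T Index Unique =
ADDCOLUMNS(
    ALL( 'T Index'[Email] ),  // one row per distinct Email
    "Firstname",
        // smallest Index among the rows for the current Email
        VAR MinIndex =
            CALCULATE(
                MIN( 'T Index'[Index] )
            )
        RETURN
            // Index is unique, so exactly one row matches and MAX just returns its Firstname
            CALCULATE(
                MAX( 'T Index'[Firstname] ),
                'T Index'[Index] = MinIndex
            )
)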
This generates the requested table.
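As an alternative, the same idea can be sketched by filtering the rows of T Index directly instead of rebuilding the table one Email at a time. This is only a sketch under the same assumptions as above: the Index column exists and is unique. The table name T Index Unique Alt is hypothetical and not part of the original answer.

```dax
T Index Unique Alt =
SELECTCOLUMNS(
    FILTER(
        'T Index',
        // keep a row only if its Index is the smallest one for its Email
        'T Index'[Index]
            = CALCULATE(
                MIN( 'T Index'[Index] ),
                // drop every filter from the row context except the one on Email
                ALLEXCEPT( 'T Index', 'T Index'[Email] )
            )
    ),
    // project away the helper Index column
    "Firstname", 'T Index'[Firstname],
    "Email", 'T Index'[Email]
)
```

With more columns than Firstname, this form may be more convenient, since FILTER keeps whole rows and each extra column only needs one more line in SELECTCOLUMNS.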
In a real-world scenario, the best place to add the new column is directly in the code that generates the input table.
Source: https://stackoverflow.com/questions/65363786/in-dax-not-powerquery-drop-duplicates-based-on-column